Power Apps - AI Builder
Unanswered

Table extraction from PDF where row data is long


Hi,

 

We have an editable PDF that contains a static table with a fixed width and a fixed number of rows and columns. We only need to capture the material description. The model works well when the material description is short and fits on a single line.

When the description is long, it wraps onto a second line within the same row, and AI Builder recognizes this as two rows instead of one.
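
To make the symptom concrete, here is a minimal post-processing sketch in Python. It is not AI Builder's API or output schema: the row shape (a list of dicts), the column names `item` and `description`, and the helper `merge_wrapped_rows` are all hypothetical. It assumes a wrapped line shows up as an extra row in which every column except the description is empty, and folds that row back into the one above it.

```python
from typing import Dict, List

Row = Dict[str, str]


def merge_wrapped_rows(rows: List[Row], text_column: str = "description") -> List[Row]:
    """Fold continuation rows into the preceding row.

    Assumption (hypothetical, not documented AI Builder behavior): a row is a
    continuation when every column other than `text_column` is empty.
    """
    merged: List[Row] = []
    for row in rows:
        # True when all columns except the text column are blank.
        others_empty = all(
            not (value or "").strip()
            for key, value in row.items()
            if key != text_column
        )
        fragment = (row.get(text_column) or "").strip()
        if merged and others_empty and fragment:
            # Append the wrapped text to the previous row's description.
            previous = merged[-1]
            previous[text_column] = f"{previous.get(text_column, '')} {fragment}".strip()
        else:
            # Copy so the caller's input rows are not mutated.
            merged.append(dict(row))
    return merged


# Hypothetical extraction result: the second description wrapped onto a new row.
extracted = [
    {"item": "1", "description": "Steel bolt M8"},
    {"item": "2", "description": "Stainless hex flange"},
    {"item": "", "description": "nut, zinc plated"},
]

for row in merge_wrapped_rows(extracted):
    print(row)
```

This heuristic only works when at least one other column is extracted alongside the description. If the description is the only column captured, every row after the first would look like a continuation and be merged.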

I have a ticket open with Microsoft (TrackingID#2109290060003230), and support states that this is a known bug that is being worked on.

Has anyone else faced this issue?

  • rahullakshmanan
    Re: Table extraction from PDF where row data is long

    Hi Joe,

     

    I have replied to you via private message about how to recreate the issue and have also shared a private preview link that helped fix the issue.

  • JoeF-MSFT
    Re: Table extraction from PDF where row data is long

    Hi again! 

     

    I've tested the documents, and the table extraction results seem correct. 🙂 I've sent you a private message showing how I did the tagging, and I'll keep investigating as we discuss the specifics of your documents.

  • rahullakshmanan
    Re: Table extraction from PDF where row data is long

    Thank you for your prompt response. I look forward to hearing from you.

  • JoeF-MSFT
    Re: Table extraction from PDF where row data is long

    Quick update: I got the details from the support ticket. I will run some tests and let you know. Thanks for your patience!

  • JoeF-MSFT
    Re: Table extraction from PDF where row data is long

    Thanks! Let me get the details on the support ticket and I'll report back. 

  • rahullakshmanan
    Re: Table extraction from PDF where row data is long

    @JoeF-MSFT I used 20 PDFs to train the model, and Microsoft Support also reproduced the same issue.

  • JoeF-MSFT
    Re: Table extraction from PDF where row data is long

    Hi @rahullakshmanan,

     

    Thanks for your question.

     

    How many documents did you use for training? The more training samples you provide in which the 'material description' is long, the better the model learns to recognize rows.
