AI Builder: Extract content from multipage tables in document processing is working for 3 pages afterwards it stops working


Hello,

 

I created a custom Document Processing AI Builder model. The layout we want to capture is an order confirmation that our supplier sends to us.
The model includes some header fields (e.g. supplier, date, references...) and an Items table (e.g. item, quantity, delivery date, price...).


We used the following feature (Extract content from multipage tables in document processing | Microsoft Learn)

However, only the first 3 pages are processed correctly: the data from pages 1, 2, and 3 is captured, but page 4 and the following pages aren't.

 


The following info is available on the page I posted:

Feature details

An experimental feature currently exists in AI Builder that allows you to extract content from multipage tables. The feature is limited and doesn't always detect tables accurately if they span more than two or three pages.

 

I think this is the problem I have. Is there a timeline for fixing this issue?

(Extra: AI Builder document processing isn't extracting tables that span across multiple pages | Microsoft Learn)

 

Regards,

Bram

  • JoeF-MSFT

    Hi @BramVeldeman - thanks for letting us know about this.

     

    How many documents did you use for training? In how many of those training documents does the table span more than 3 pages?

     

    Can you add more sample documents for training where the table spans more than 3 pages? This will help the model learn better.

     

    Let us know if this helps. 

  • Bram Veldeman

    Hello, I used 7 documents: 3 documents have more than 3 pages, and 4 documents have only 1 page.

    When I test, only the first 3 pages get captured. AI Builder doesn't capture the other ones.

    I will try to add some extra documents with more than 3 pages.

     

    Regards,

    Bram

  • Bram Veldeman

    Hello @JoeF-MSFT 

    My colleague and I each trained our own collection, with similar results.

    Collection 1:

    • 13 documents
      • 1x 1 Page
      • 3x 2 Pages
      • 3x 3 Pages
      • 3x 4 Pages
      • 1x 6 Pages
      • 2x 8 Pages

    Collection 2:

    • 16 documents
      • 4x 1 Page
      • 2x 4 Pages
      • 1x 5 Pages
      • 2x 6 Pages
      • 4x 7 Pages
      • 2x 8 Pages
      • 1x 9 Pages

     

    Once a document exceeds 3 pages, the results from AI Builder aren't trustworthy for the following pages. Pages 1 - 3 are captured almost perfectly. After page 3 the model doesn't recognize certain fields, or even whole pages...
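
As a quick check of how many training documents in each collection go beyond 3 pages, the page counts listed above can be tallied with a small sketch like this one. The dictionary layout and the `summarize` helper are illustrative assumptions, not part of AI Builder.

```python
# Page-count distribution of each training collection, as listed in the post:
# {pages per document: number of documents}
collection_1 = {1: 1, 2: 3, 3: 3, 4: 3, 6: 1, 8: 2}
collection_2 = {1: 4, 4: 2, 5: 1, 6: 2, 7: 4, 8: 2, 9: 1}


def summarize(pages_to_docs, threshold=3):
    """Return (total documents, documents with more than `threshold` pages)."""
    total = sum(pages_to_docs.values())
    long_docs = sum(n for pages, n in pages_to_docs.items() if pages > threshold)
    return total, long_docs


for name, collection in [("Collection 1", collection_1), ("Collection 2", collection_2)]:
    total, long_docs = summarize(collection)
    print(f"{name}: {total} documents, {long_docs} with more than 3 pages")
```

With these numbers, 6 of the 13 documents in collection 1 and 12 of the 16 documents in collection 2 have more than 3 pages.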

    Could we contact you to have a look at this?

    Thanks in advance and Kind regards,

    Bram 

  • JoeF-MSFT

    Hi Bram - could you do the following test?

     

    1. Edit your existing model. 

    2. Select Unstructured documents as document type.

    3. Retrain the model. 

     

    After the model is trained, do you get better results?

     


     

  • Bram Veldeman

    Hello @JoeF-MSFT 

    Our findings:

    The performance of the model when testing has increased dramatically. Sometimes it makes mistakes on the pages at the end, but it's a lot better than before.

    I don't get it, to be honest. Our documents look structured when I compare them with the examples, so why did I have to choose unstructured documents? Is this logical?

    Thanks for the help and kind regards,

    Bram

  • Bram Veldeman

    Hello, another remark for the product team.

    I have read that when exporting an AI model and importing it into another environment, only the model is exported and not the training data. Therefore it isn't possible to edit the model in the new environment.

    As we started in a test environment, this is quite a pity, because we lose a couple of days recreating our collections.
    This could be improved.


    Kind regards,
    Bram


  • JoeF-MSFT

    Hi @BramVeldeman - thanks for sharing the outcome of using the unstructured documents option. Under the hood it uses a newer AI technology that works very well for unstructured documents, and also for structured ones. Currently it works best for documents in English, and it's not available in all Power Automate regions. In a future AI Builder update, once we support more languages and regions, we will most certainly use this new AI model for both structured and unstructured documents.

     

    Thanks as well for the feedback on moving training data across environments - this is also planned for an upcoming AI Builder update.

  • Bram Veldeman

    Hello @JoeF-MSFT, thanks for the feedback - now it makes more sense. We'll keep on testing and hope our proof of concept works out the way we planned 👍

    Another option that my colleague and I discussed today that would be a big improvement in our eyes: 
    The possibility to train the model while testing the model.

    The ideal flow in our eyes:

    1) I use 5 documents to train the initial model

    2) afterwards I upload a test document and notice a couple of mistakes (e.g. 20% is wrong) > I correct them and add the tested document + corrected data to the model

    3) retrain the model based on 6 documents

     

    Right now it's quite time-consuming to tag the tables in multipage documents for some of our vendors (plus not-so-structured tables that require advanced tagging). It would be a great time saver if a (corrected) test document could be added to a collection, so that not every single field has to be re-tagged every time.

    Kind regards,
    Bram

  • JoeF-MSFT

    Thanks @BramVeldeman - this is fantastic feedback! 🙂 It goes in the direction we're working towards: making it easier and more delightful to build document processing automation solutions. 

     

    Have you tried the new feedback loop feature? See "Continuously improve your model (preview) - AI Builder | Microsoft Learn". You can add a condition to your flow so that, if some quality conditions are not met, the document is sent back to the training set.
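
The kind of quality condition described here could look roughly like the sketch below. It is written in Python only to illustrate the logic; in practice the check would be a condition in a Power Automate flow. The field names, the `value`/`confidence` shape, and the 0.8 threshold are all assumptions made for the example, not the actual AI Builder output schema.

```python
def needs_review(fields, min_confidence=0.8):
    """Return the names of fields whose confidence is below the threshold."""
    return [name for name, field in fields.items()
            if field["confidence"] < min_confidence]


# Hypothetical extraction result for one document (shape is an assumption).
extracted = {
    "supplier": {"value": "Example Supplier", "confidence": 0.97},
    "quantity": {"value": "12", "confidence": 0.55},
}

low_confidence = needs_review(extracted)
if low_confidence:
    # In a flow, this branch is where the document would be sent back
    # to the training set for correction.
    print("Send back for review, low-confidence fields:", low_confidence)
else:
    print("Document accepted")
```

Choosing the threshold is a trade-off: a higher value sends more documents back for tagging, while a lower one lets more uncertain values through.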

  • Bram Veldeman

    Hello Joe,

     

    Will have a look at that feature. Thanks.

    I have one remark after switching from structured to unstructured documents.
    We have 2 collections. However, since switching from structured to unstructured, the response always gives back the same layoutName. Before, it returned the correct layoutName for the document that was used. The document itself gets processed correctly, but I don't think this is right, because the layouts from the 2 vendors aren't the same. Could this be a bug?

    used "layoutName\":\"XXXXX\",\"layoutConfidenceScore\":0.9869999885559082}
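
The escaped quotes in that fragment suggest the result arrives as a JSON string nested inside another JSON document. A minimal sketch of reading `layoutName` and `layoutConfidenceScore` from such a payload follows. Only those two keys and their values come from the fragment above; the outer `body` key and the overall payload shape are assumptions made for illustration.

```python
import json

# Hypothetical payload: the inner object is serialized as a string inside
# the outer object, which is what produces the escaped quotes (\").
inner = {"layoutName": "XXXXX", "layoutConfidenceScore": 0.9869999885559082}
raw_response = json.dumps({"body": json.dumps(inner)})  # "body" is an assumed key


def read_layout(raw):
    """Decode the outer JSON, then the nested JSON string, and return
    (layoutName, layoutConfidenceScore)."""
    nested = json.loads(json.loads(raw)["body"])
    return nested["layoutName"], nested["layoutConfidenceScore"]


name, score = read_layout(raw_response)
print(f"layout={name} confidence={score:.3f}")
```

Logging this pair for one document from each vendor would make it easy to confirm whether both really come back with the same layoutName.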

    Regards,
    Bram
