Training Failures on Custom Document Processing Model

Hello colleagues! I'm attempting to customize a Document Processing model, training it to read a monthly invoice from our cell carrier. The document consists of up to 30 pages of a single table with 15 columns and 12-15 rows, depending on the page. There are various data types, but I've assigned them all "text". The table has no headers useful for training. Because I only have two paper examples of the invoice to work with, and I need a minimum of five documents, I've split the two documents into 11 PDFs of different page counts, with the same first and last pages, to simulate the needed number of training docs, about 90 pages in total. The tables are built exactly the same way on all pages except page 1, which has some minor differences. I've created a table with headers in the model and have spent hours building the table (tagging) on each page, selecting the "Table continues on next page" option during the tagging process. I've checked each document to make sure it accepts the table as I expected. Everything looks good.
 
After three attempts and much cursing, the results have been disappointing. Using the "Quick Test," it appears the model correctly identifies the table boundaries, columns, and rows on the first few pages. Starting on page three or four, the model starts to lose the thread: misaligning the table boundaries, failing to read cell data in whole or in part, dropping columns, adding rows, and so on.
 
Any explanation for this behavior? Do I need more training docs? A different approach altogether? I've thought of building a table for each page, but that's complex and unworkable because of the varying length of the table. Ultimately, I plan to build a Power Automate flow that reads the data into an Excel file for later analysis. (I've done this once with the Invoice Model. My new model isn't reliable enough at the moment to proceed.)
 
Thanks in advance for your help.
 
Wannabe AI Fanboi
  • yashag2255 (Super User 2024 Season 1)
     
    Can you confirm whether the layout of the different pages in the same document is exactly the same? If not, you will have to train the model with at least 5 different files that have the same layout. Also, are you custom mapping individual cells in the table, or are you using the row and column differentiators that let you mark the columns and rows directly? This may be another reason your training output is not showing up as expected.
     
    With custom models, although it says at least 5, in my experience having 8-10 documents for training yields better results.
     
     
    Hope this Helps!
     
    If this reply has answered your question or solved your issue, please mark this question as answered. Answered questions help users in the future who may have the same issue or question quickly find a resolution via search. If you liked my response, please consider giving it a thumbs up. THANKS!
  • JF-14022317-0
    Hello @yashag2255!
     
    Thanks for replying! I have a total of 11 documents. They have very similar layouts, but not exactly the same, perhaps 95%. They are actual invoices from two different months, so there are natural differences, within a range, that the model would encounter over time.
     
    I used the row and column differentiators, not the custom mapping. The first page of the invoice has 12 rows and 15 columns. The following pages, whose number varies, each have 15 rows and 15 columns. The final page can have anywhere from 1 to 15 rows, always with 15 columns. Again, the number of rows can vary by month, but the number of columns is always the same.
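The layout described above implies a predictable table shape, which could be used to sanity-check whatever the model extracts before it reaches Excel. The following is only a minimal sketch under stated assumptions: the extracted table is assumed to arrive as a plain list of rows (each a list of cell strings), the invoice is assumed to have at least two pages, and the names `expected_row_range` and `find_shape_problems` are hypothetical helpers, not part of AI Builder or Power Automate.

```python
# Layout figures taken from the invoice description in this thread.
FIRST_PAGE_ROWS = 12   # rows on page 1
FULL_PAGE_ROWS = 15    # rows on every middle page, and the maximum on the last page
COLUMNS = 15           # every row should have exactly this many cells


def expected_row_range(page_count):
    """Return (min_rows, max_rows) for an invoice with page_count pages.

    Assumption: at least two pages (a first page and a last page).
    """
    if page_count < 2:
        raise ValueError("this sketch assumes at least a first and a last page")
    fixed = FIRST_PAGE_ROWS + FULL_PAGE_ROWS * (page_count - 2)
    # The last page holds between 1 and FULL_PAGE_ROWS rows.
    return fixed + 1, fixed + FULL_PAGE_ROWS


def find_shape_problems(rows, page_count):
    """List human-readable problems with an extracted table's shape."""
    problems = []
    low, high = expected_row_range(page_count)
    if not low <= len(rows) <= high:
        problems.append(f"expected {low}-{high} rows, got {len(rows)}")
    for index, row in enumerate(rows, start=1):
        if len(row) != COLUMNS:
            problems.append(
                f"row {index} has {len(row)} columns, expected {COLUMNS}"
            )
    return problems
```

For a 30-page invoice, `expected_row_range(30)` gives a range of 433 to 447 rows, so a result well outside that range, or any row without 15 cells, would point to the dropped columns and added rows described earlier in the thread.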
     
    What do you think?
  • yashag2255 (Super User 2024 Season 1)
     
    If the number of columns is fixed on each page, then that should be enough. However, when you upload an invoice that has more than one page, a checkbox labeled "Table continues on next page" shows up when you identify a table on the first page, and you can mark it. Have you tried that?
     
    I have created a custom model for documents with multiple pages. It works as expected for the first 3-4 pages, but after about 5 pages there are blanks or improper data extraction. Maybe the model needs more training, or perhaps it is a limitation, but there is no documentation for this.
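To pin down where extraction starts to degrade, one option is to measure the share of blank cells per page. This is a rough sketch, not a documented feature: it assumes the extracted rows can be paired with their page number as `(page_number, cells)` tuples, which may not match what the model actually returns, and the 20% threshold and the helper names are arbitrary illustrations.

```python
def blank_ratio_by_page(rows):
    """Map each page number to the fraction of its cells that are blank.

    rows: iterable of (page_number, cells) pairs (an assumed input shape).
    """
    totals = {}
    blanks = {}
    for page, cells in rows:
        totals[page] = totals.get(page, 0) + len(cells)
        blank_count = sum(
            1 for cell in cells if cell is None or not str(cell).strip()
        )
        blanks[page] = blanks.get(page, 0) + blank_count
    # Skip pages that contributed no cells to avoid dividing by zero.
    return {page: blanks[page] / totals[page] for page in totals if totals[page]}


def first_degraded_page(rows, max_blank_ratio=0.2):
    """Return the lowest page whose blank ratio exceeds the threshold, or None."""
    ratios = blank_ratio_by_page(rows)
    for page in sorted(ratios):
        if ratios[page] > max_blank_ratio:
            return page
    return None
```

Running this over several test invoices would show whether the blanks consistently begin around the same page, which is useful evidence when deciding between more training data and a different approach.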
     
    Hope this Helps!
     
     
  • JF-14022317-0
     
    I have used the "table continues on next page" option, but the issue persists. I'm hoping more training data will resolve the issue. The invoices only come in once a month, so I'll have to be patient!
     
    Thanks!
