Hi.
We are doing a POC for a customer on what Form Processing can do.
At the moment their sample documents contain a lot of text and tables (some spanning multiple pages as well).
The text we want to extract also doesn't sit in the same location from one document to the next.
I know we can create document sets for different layouts, but in our case the documents are probably too different from each other in terms of content and tables.
How does AI Form Processing recognise tags and tables?
Is it based on the location of the text/tables? Do documents need to have text and tables in relatively similar locations and in a similar order?
For tables, do they need to have the same headers? Can it recognise a table based on a tag outside the table, e.g. a table title or caption?
Thank you.
I just couldn't find the exact documentation on how AI Builder form processing works behind the scenes.
If your documents don't differ too much, with items in similar locations and similar table headers, they can be trained in the same collection.
The model has some flexibility in how it locates tables and their data.
Otherwise, you will probably have to build several collections, grouping together documents that are reasonably consistent with one another.
As for your question about tags outside the table, it is difficult to answer without seeing sample documents.
The best approach is probably to try it out.
You can activate a one-month trial to test the capabilities.
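If it helps with deciding which documents belong in the same collection, here is a rough Python sketch of one way to pre-sort samples by how similar their table headers are. This is not part of AI Builder and says nothing about how the model works internally. The file names, the header lists (typed in by hand here), the `group_by_headers` helper, and the 0.6 threshold are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of header names (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two header-less documents are treated as identical
    return len(a & b) / len(a | b)


def group_by_headers(docs, threshold=0.6):
    """Greedily group documents whose table headers overlap enough.

    docs: mapping of document name -> list of table header strings.
    threshold: hypothetical cut-off; tune it against your own samples.
    """
    groups = []  # each entry: (headers of the first document, member names)
    for name, headers in docs.items():
        normalized = {h.strip().lower() for h in headers}
        for representative, members in groups:
            if jaccard(representative, normalized) >= threshold:
                members.append(name)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append((normalized, [name]))
    return [members for _, members in groups]


# Hypothetical sample data, entered manually for the example.
docs = {
    "invoice_a.pdf": ["Item", "Qty", "Unit Price", "Total"],
    "invoice_b.pdf": ["item", "qty", "unit price", "total", "tax"],
    "report_c.pdf": ["Date", "Site", "Reading"],
}

groups = group_by_headers(docs)
for index, members in enumerate(groups, start=1):
    print(f"Candidate collection {index}: {members}")
```

Each resulting group is only a candidate collection. Whether documents in a group actually train well together still has to be checked in the trial.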