
Hello,
I'm currently working on automating invoice processing using an AI prebuilt model.
While it generally works well with most invoices, I've noticed occasional issues with detecting the full items table properly.
I understand that invoice quality or certain formats can be a challenge and can confuse the prebuilt model. It correctly recognizes most of the items within the invoices, but occasionally it skips some of them or the detected data is wrong.
Sometimes reprocessing the invoice fixes the issue, but that's not an ideal solution.
I've also tried training a model, but I get the same result with some invoices.
I would appreciate any insights or suggestions on how to improve the accuracy of the AI model for invoice processing.
Here is an example of the data recognition issue.
The rows in red are incorrect.
Description Quantity Unit price Amount
| |AGUACATE KG | 2.100 | 5.50 | | 11.55 | |
| |AJO TRENZA | 10 | 2.00| | 20.00 | |
| | ALBAHACA BANDEJAS | 10 | 1.25 | | 12.50 | |
| | APIO KILO | 10 | 2.901 | 29.00 | |
| |CEBOLLINA MAZO | 1 | 3.50| | 3.50 | |
| LECHUGA GREEN LEAF KILO | 6| | 6.50| | 39.00| |
| ROMANA KILO | 6|LECHUGA | 6.00| | 36.00| |
| |LIMON AMARILLO UND | 10 | 0.75| | 7.50 | |
| |NARANJA IMP. UNIDAD | 5 | 0.91| | 4.55 | |
| |PAPA KG | 25.200 | 3.15| | 79.38| |
| |REPOLLO VERDE KILO | 6.700 | 3.05| | 20.44| |
| PAQ. | 2|RABANO | 3.75| | 7.50 | |
| | TOMATE 3 X 3 KILO | 25 | 4.70| | 117.50 | |
| | TOMATE CHERRY BANDEJA | 10 | 2.00 | 20.00 | |
| |ZANAHORIA KILO | 10 | 2.35| | 23.50| |
Thank you!
Hi @YanysM ,
Thanks for bringing that to our attention.
This seems to be caused by the OCR service misdetecting words. It may improve in the future, as Microsoft regularly improves its OCR capability, but it can't be fixed right now through AI Builder model training.
If you are running a Power Automate flow to extract the information, I would recommend including a cleaning step after data extraction to handle such cases. For example, if you detect "|" or even letters in the quantity field, you could assume that part of the string is not in the correct place. You could therefore add logic that turns
| ROMANA KILO | 6|LECHUGA |
to
| LECHUGA ROMANA KILO | 6 |
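To make that cleaning rule concrete, here is a minimal sketch of the logic in Python. It is only an illustration: in a Power Automate flow you would need to express the same steps with flow expressions or a script action, and the function name `clean_row` and the assumption that each row arrives as separate description and quantity strings are hypothetical, not part of AI Builder.

```python
import re
from typing import Tuple


def clean_row(description: str, quantity: str) -> Tuple[str, str]:
    """Move stray text found in the quantity field back into the description.

    Hypothetical helper: assumes the quantity should start with a number and
    that anything after it (a "|" or letters) belongs to the description.
    """
    # Capture the leading number (e.g. "6" or "25.200") and whatever follows it.
    match = re.match(r"\s*(\d+(?:[.,]\d+)?)(.*)$", quantity, flags=re.DOTALL)
    if match is None:
        # No leading number at all: leave the row untouched for manual review.
        return description.strip(), quantity.strip()

    number, rest = match.groups()
    # Drop the column separator and surrounding whitespace from the leftover text.
    stray = rest.replace("|", " ").strip()
    if not stray:
        return description.strip(), number

    # Prepend the misplaced words to the description.
    return f"{stray} {description.strip()}".strip(), number


print(clean_row("ROMANA KILO", "6|LECHUGA"))
```

Rows whose quantity is already a clean number pass through unchanged, so the step can be applied to every extracted row.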
Hope that helps!