Dear Team,
I'm a recent addition to our group. I'm actively involved in a project focused on OCR solutions, specifically using Azure AI Document Intelligence (formerly known as Azure Form Recognizer). Our approach uses a custom template-based model to process invoices. However, training this custom model is time-consuming, taking several hours for each invoice template. With a large number of templates, such as 300, the associated cost and effort become quite significant.
We have explored the possibility of using prebuilt models but encountered a challenge with the "Invoice" prebuilt model. It generates distinct "Key-value" pairs for each unique "Invoice" template, necessitating additional efforts to tailor the processing according to our specific requirements.
I'd like to ask whether there are any open-source, non-template-based OCR APIs available for invoices. Such a solution would ideally provide output in JSON format that can be readily processed. Any assistance or guidance on this matter would be greatly appreciated.
Regards
@AnuR1
I’ve recently put together a template that does this: it OCRs any given PDF in any format, creates a replica text of the document, & passes that replica text on to Azure’s GPT for data extraction to JSON. Making adjustments to the extract is as easy as typing extra details into the default prompt.
https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Extract-Data-From-PDFs-and-Images-With-GPT/td-p/2201345
It will have trouble & likely fail if you have more than 3-4 pages due to prompt text limits, but you can select which page numbers you want it to work on if only a few are relevant.
Also it only uses less expensive AI Builder OCR services, so it takes half the AI Builder credits of the other form/document tools.
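For anyone who wants to see the shape of that flow outside Power Automate, here is a minimal Python sketch of the same idea: pick the relevant pages, build one extraction prompt from the OCR text, and parse the JSON reply. It is not the linked template's actual implementation. Every name in it (`select_pages`, `build_prompt`, `extract_invoice`, the field list) is hypothetical, `complete` stands in for whatever call sends a prompt to your GPT deployment, and `max_chars` is an arbitrary placeholder rather than a real service limit.

```python
import json
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical field list; adjust to whatever your invoices need.
DEFAULT_FIELDS = ["invoice_number", "invoice_date", "vendor_name", "total_amount"]


def select_pages(pages: Sequence[str], page_numbers: Sequence[int]) -> List[str]:
    """Keep only the requested 1-based pages, skipping out-of-range numbers."""
    return [pages[n - 1] for n in page_numbers if 1 <= n <= len(pages)]


def build_prompt(pages: Sequence[str], fields: Sequence[str], max_chars: int = 12000) -> str:
    """Build one extraction prompt; fail early if it exceeds the assumed budget."""
    header = (
        "Extract the following fields from the invoice text below and reply "
        "with a single JSON object using exactly these keys: "
        + ", ".join(fields)
        + ". Use null for any field that is missing.\n\n"
    )
    prompt = header + "\n\n".join(pages)
    if len(prompt) > max_chars:
        raise ValueError(
            f"Prompt is {len(prompt)} characters, over the {max_chars} budget; "
            "select fewer pages."
        )
    return prompt


def extract_invoice(
    pages: Sequence[str],
    page_numbers: Sequence[int],
    complete: Callable[[str], str],
    fields: Sequence[str] = DEFAULT_FIELDS,
    max_chars: int = 12000,
) -> Dict[str, Optional[object]]:
    """Send OCR text of the chosen pages to `complete` and parse its JSON reply."""
    prompt = build_prompt(select_pages(pages, page_numbers), fields, max_chars)
    reply = complete(prompt)
    # Models often wrap JSON in extra words, so take the outermost braces.
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("No JSON object found in the model reply.")
    data = json.loads(reply[start : end + 1])
    # Return only the requested keys, with None for anything missing.
    return {key: data.get(key) for key in fields}
```

Because every invoice goes through the same prompt and the same key list, the output keys stay fixed regardless of the invoice layout, and changing what gets extracted means editing the field list or the prompt text rather than retraining a model.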
Thank you, Tkoloa.