Hello everyone,
I have been working on building a custom model in AI Builder to be used in a Power Automate cloud flow.
The PDF files I would be extracting data from are generated PDFs, not scans.
Power Automate Desktop has an "Extract text from PDF" action.
The functionality it offers is more than enough for my application.
My question is: is this feature available for Power Automate cloud flows, or will it be?
Currently I have to use the AI Builder solution for automated cloud flows, with all the extra manual tagging and model training that it requires. Using third-party connectors is not and will not be allowed by our DLP policy, so sadly that is a no-go.
Thank you in advance.
Lex
How about using an Adobe connector?
You can get the content as JSON and parse it to extract the appropriate data.
Thinking about it a bit more, if you have the ability to provision Azure resources, you could also give Form Recognizer's Layout a try. It's the foundational service we use for document processing in AI Builder. Layout should provide the data you're looking for.
Layouts - Form Recognizer - Azure Applied AI Services | Microsoft Docs
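For anyone wanting to try the Layout route from a script, here is a minimal sketch in Python. It assumes the azure-ai-formrecognizer package (version 3.2 or later) and a provisioned Form Recognizer resource; the endpoint, key and file path are placeholders you would supply, and the fake table at the bottom is a made-up stand-in so the grid helper can be tried without calling the service.

```python
from types import SimpleNamespace


def table_to_grid(table):
    """Turn a Layout table (row_count, column_count, cells) into a list of rows."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


def analyze_layout(path, endpoint, key):
    """Run the prebuilt Layout model on a local PDF.

    endpoint and key are assumptions: they come from your own Azure resource.
    """
    # Imported here so table_to_grid can be used without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-layout", document=f)
        result = poller.result()

    tables = [table_to_grid(t) for t in result.tables]
    # Selection marks carry the checkbox state ("selected" or "unselected").
    marks = [m.state for page in result.pages for m in (page.selection_marks or [])]
    return tables, marks


# Hypothetical stand-in for one table in a Layout result, for illustration only.
fake_table = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, content="Item"),
        SimpleNamespace(row_index=0, column_index=1, content="Qty"),
        SimpleNamespace(row_index=1, column_index=0, content="Widget"),
        SimpleNamespace(row_index=1, column_index=1, content="3"),
    ],
)
grid = table_to_grid(fake_table)
print(grid)
```

Calling analyze_layout on a real file would return the tables as row lists plus the checkbox states, which could then be mapped onto whatever columns you need.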
Understood. It'd be worth logging that as an idea on https://aka.ms/aibuilder-ideas to help us prioritize as an improvement for future changes. Right now, unfortunately, the only way to get table and checkbox data is with a custom trained model :'(
When I tried it with Power Automate Desktop the extracted text, including tables and checkboxes (the X marking the checkbox to be precise), was fairly consistent. It would be fairly trivial for me to write a script or even a flow to search for the keywords and parse the text.
The extracted data would be saved to a Dataverse table.
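A minimal sketch of that kind of keyword parsing, in Python. The sample text, the field labels and the record keys are all hypothetical; the only thing taken from the description above is that a checked box shows up as an X in the extracted text. Real documents would need their own patterns.

```python
import re

# Hypothetical sample of text as it might come out of "Extract text from PDF".
SAMPLE_TEXT = """\
Order number: 12345
Customer: Contoso Ltd
Express delivery X
Gift wrapping
"""


def extract_field(text, label):
    """Return the value following 'label:' on the same line, or None."""
    pattern = rf"^{re.escape(label)}[ \t]*:[ \t]*(.+)$"
    match = re.search(pattern, text, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def is_checked(text, label):
    """Return True if a line consists of the label followed by a lone X."""
    pattern = rf"^{re.escape(label)}[ \t]+[Xx][ \t]*$"
    return re.search(pattern, text, flags=re.MULTILINE) is not None


# One flat record per document, ready to map onto table columns.
record = {
    "order_number": extract_field(SAMPLE_TEXT, "Order number"),
    "customer": extract_field(SAMPLE_TEXT, "Customer"),
    "express_delivery": is_checked(SAMPLE_TEXT, "Express delivery"),
    "gift_wrapping": is_checked(SAMPLE_TEXT, "Gift wrapping"),
}
print(record)
```

The patterns are anchored to single lines on purpose, so a label with an empty value does not accidentally pick up text from the following line.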
Thanks for your time.
Lex
It will extract text from everywhere. It won't extract structure like tables or checkbox state, though. Is that something you'd find valuable without the extra value you'd get from a trained model? In a sense, I'm curious what you do with the data once you've extracted it.
No, I have not tried it yet. From the description on the AI Builder Explore page, "Extract from documents" seemed the most appropriate.
Hi Lex, That's a great question. Have you tried using "Text Recognition" in AI Builder? It provides a prebuilt model to extract text from any type of PDF document. You can learn more about the feature here: Recognize text with AI Builder - Learn | Microsoft Docs. Do let us know if this works for you or if I misunderstood your question.