Is there any way in Power Automate to extract the contents of a PDF, send them to a third party for processing, and then add the extracted sensitive information back?
Encodian has actions that allow you to extract text from PDF files:
These actions require an API key and a subscription, so they are not free to use. The extract text action above requires that the PDF not be image based. If the PDF is image based, Encodian has an action to perform OCR on the file. In the process illustrated above, I am using one of their AI actions to generate a summary of the document. The extract PDF metadata action is needed to get the number of pages in the PDF, which, as you can see above, is used to set the end page.
The extracted text will contain a lot of line breaks and white space, so it will not be pretty, and it loses just about all formatting. For what I need, which is having AI generate a summary, that does not matter: the AI ignores the extra white space and line breaks. The process above uses roughly four credits per document.
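If the extra white space does matter for whatever the text is handed to next, it can be collapsed before sending. The following is only a minimal sketch in Python, not an Encodian action or anything built into Power Automate; the function name `normalize_whitespace` and the sample string are made up for illustration.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of spaces, tabs and line breaks into a single space."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical example of what extracted PDF text can look like.
sample = "Invoice   No. 123\n\n\n  Total:\t 45.00 \n"
cleaned = normalize_whitespace(sample)
print(cleaned)
```

Note that this also removes paragraph breaks, so it only suits cases such as summarization where the layout is not needed.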
I have not used it, but they also have an action to replace text in a PDF:
The subject of your post mentions masking (or redacting) content, but the body of your post does not say that you need to do this. If you do need to redact content, Encodian has a redact action as well.
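If what you are after is reversible masking rather than permanent redaction (hide the sensitive values, send the text out, then put the values back), the general idea can be sketched as below. This is an assumption about what you need and not something Encodian provides as far as I know: the pattern `EMAIL_PATTERN`, the `[[MASK_n]]` placeholder format, and the function names `mask` and `restore` are all hypothetical, and the email pattern is deliberately simple.

```python
import re

# Deliberately simple, illustrative pattern for email addresses (an assumption,
# not a complete email matcher; an address directly followed by a full stop
# would capture the full stop as well).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask(text, pattern=EMAIL_PATTERN):
    """Replace each match with a placeholder and remember the original value."""
    mapping = {}

    def _replace(match):
        token = f"[[MASK_{len(mapping)}]]"
        mapping[token] = match.group(0)
        return token

    return pattern.sub(_replace, text), mapping


def restore(text, mapping):
    """Put the original values back in place of their placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


original_text = "Contact jane@example.com or bob@example.org today."
masked_text, mapping = mask(original_text)
# masked_text is what would be sent to the third party; mapping stays with you.
restored_text = restore(masked_text, mapping)
```

This only works if the third party returns the placeholders unchanged, and the mapping has to be stored somewhere safe for the duration of the round trip.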