
Hello everyone,
I have a question about using AI Builder to extract specific scanned pages from PDF documents. I've come across various resources online that discuss using AI Builder to extract data from PDFs, but I'm interested in a slightly different task.
I have a collection of PDF documents containing scanned pages, and I'm wondering if it's possible to leverage AI Builder to create a model that can identify specific pages with a consistent layout and export these pages as individual PDF files. These pages share the same layout, making them distinguishable from the rest of the document.
Has anyone attempted a similar task using AI Builder or a similar tool? If so, could you provide some guidance on how to set up such a model or any alternative approaches that might achieve this task?
Thank you in advance for your insights and suggestions! Your help would be greatly appreciated.
Hi @apara ,
You can extract all the text from a document using AI Builder Text recognizer and then identify the specific pages with, for example, a text search. But you can't create PDF files with AI Builder. See a recent thread in which some workarounds are discussed: Auto Redact PDF Documents - Power Platform Community (microsoft.com)
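To illustrate the text search step, here is a minimal sketch in Python. It assumes you already have the recognized text of each page as a plain string (however you obtained it), and that the pages you want all contain a few fixed phrases from the shared layout. The function name `find_matching_pages` and the sample page texts are made up for this example and are not part of AI Builder.

```python
import re
from typing import Iterable, List


def find_matching_pages(page_texts: List[str], markers: Iterable[str]) -> List[int]:
    """Return 1-based page numbers whose text contains every marker phrase.

    Matching ignores case and collapses runs of whitespace, because OCR
    output often has irregular spacing and line breaks.
    """
    normalized_markers = [re.sub(r"\s+", " ", m).strip().lower() for m in markers]
    matches = []
    for number, text in enumerate(page_texts, start=1):
        normalized = re.sub(r"\s+", " ", text).lower()
        if all(marker in normalized for marker in normalized_markers):
            matches.append(number)
    return matches


# Hypothetical OCR output, one string per scanned page.
pages = [
    "Cover letter\nDear Sir or Madam",
    "INVOICE\nInvoice  No: 1001\nTotal due: 120.00",
    "Terms and conditions",
    "Invoice\nInvoice No: 1002\nTotal due: 75.50",
]

print(find_matching_pages(pages, ["invoice no", "total due"]))  # [2, 4]
```

The resulting page numbers still have to be handed to a separate tool that can split the PDF, since that part is outside what AI Builder does.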