Copilot Studio

Copilot responses from PDF files

Posted by MT-22090639-0

We have uploaded PDF files to SharePoint as a data source.

Are there any limitations on reading text from those PDFs? Images are also embedded within the PDFs.

Categories:

General topics

All responses (3)

Mahesh Chintha 158

We have seen that the GPT model behind Copilot Studio can read images and company logos and performs OCR on high-quality images, but OCR quality has degraded since last week.

I recommend uploading documents with high-quality images and testing with the current version.

Suggested answer

SaiRT14 1,992 Super User 2025 Season 1

Please consider the following:

Power Apps and Power Automate have limitations when it comes to extracting text from PDF files, especially when the PDF contains embedded images or non-selectable text.

Native Power Automate PDF actions do not directly support extracting text from PDF files that contain images or scanned documents.

If your PDFs are scanned documents or contain embedded images (that is, the text is part of an image), text extraction will not work unless Optical Character Recognition (OCR) is used.

Power Automate doesn’t have a built-in OCR feature, but you can integrate with services like AI Builder or third-party OCR tools.

AI Builder (a part of the Microsoft Power Platform) can be used to extract text from PDFs, including handling image-based PDFs via OCR.
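As a rough illustration of the point above, here is a minimal Python sketch that sorts PDF pages into those that can be read directly and those that would need OCR. It is not part of Copilot Studio, Power Automate, or AI Builder. `PageSummary`, `classify_page`, `pages_needing_ocr`, and the `min_text_chars` threshold are all hypothetical names and values; the per-page counts are assumed to come from whatever PDF parser you already use.

```python
from dataclasses import dataclass


@dataclass
class PageSummary:
    """Hypothetical per-page summary; fill it from your own PDF parser."""
    number: int       # 1-based page number
    text_chars: int   # count of selectable (extractable) characters
    image_count: int  # count of embedded images on the page


def classify_page(page, min_text_chars=20):
    """Classify a page as 'text', 'scanned', or 'mixed'.

    min_text_chars is an assumed threshold, not a documented limit.
    """
    if page.image_count == 0:
        return "text"      # nothing embedded, so plain extraction is enough
    if page.text_chars < min_text_chars:
        return "scanned"   # images with almost no selectable text
    return "mixed"         # selectable text plus images that may hold more text


def pages_needing_ocr(pages, min_text_chars=20):
    """Return the page numbers where OCR would be needed to read everything."""
    return [
        page.number
        for page in pages
        if classify_page(page, min_text_chars) != "text"
    ]


if __name__ == "__main__":
    sample = [
        PageSummary(number=1, text_chars=1200, image_count=0),
        PageSummary(number=2, text_chars=0, image_count=1),
        PageSummary(number=3, text_chars=800, image_count=2),
    ]
    print(pages_needing_ocr(sample))
```

Pages flagged this way could be routed through an OCR step before being uploaded, while text-only pages can be used as they are.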

SharePoint and Power Automate may encounter issues with large PDF files or PDFs with highly complex formatting. Processing times may increase, or extraction may fail for large documents.

Let me know if you need more details.

Suggested answer

Vinoth Selvam 1,592 Super User 2025 Season 1

Hi MT-22090639-0,

Currently, Copilot should be able to read the text in PDF documents without any issues. We just need to make sure that the PDF file is properly formatted.

Regarding images embedded in PDFs, Copilot cannot currently process them. However, Microsoft has announced that this feature is coming soon, so Copilot should soon be able to process embedded images inside PDFs as well.

For now, you can check this possibility:

https://techcommunity.microsoft.com/t5/startups-at-microsoft/rag-on-pdf-with-text-and-embedded-images-with-citations/ba-p/4208617

Thanks.
