Power Platform Community Forum Thread Details

Problem
When adding a SharePoint folder as a Knowledge source in Copilot Studio, only text that is embedded in files (Word/Excel/TXT/text‑PDF) is indexed.
OCR results that SharePoint stores in metadata columns (e.g., “Extracted text”) are not read or indexed.
As a result, scanned PDFs and images remain invisible to the agent unless we build additional automation to export OCR text into separate .txt/.docx files and upload them.
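
For anyone who needs the workaround in the meantime, here is a minimal sketch of the export step: it turns library items that carry OCR text into sidecar .txt files. Everything named here is an assumption for illustration, not confirmed product behaviour: the `ExtractedText` internal column name, the `.ocr.txt` suffix, and the item shape (a `fields` dict with `FileLeafRef`, similar to what a list-items query with expanded fields might return). Fetching the items and uploading the results are left out.

```python
from dataclasses import dataclass

# Assumption: internal name of the column that holds the OCR output.
# Check the actual internal name in your library's column settings.
OCR_COLUMN = "ExtractedText"


@dataclass
class Sidecar:
    name: str     # file name to upload next to the original file
    content: str  # OCR text to store in that file


def build_sidecars(items: list[dict]) -> list[Sidecar]:
    """Build one .txt sidecar per item that has non-empty OCR text."""
    sidecars = []
    for item in items:
        fields = item.get("fields", {})
        file_name = fields.get("FileLeafRef", "")
        text = (fields.get(OCR_COLUMN) or "").strip()
        if not file_name or not text:
            # No file name or no OCR text: nothing to export for this item.
            continue
        sidecars.append(Sidecar(name=f"{file_name}.ocr.txt", content=text))
    return sidecars
```

Uploading each sidecar into the same folder as its source file means it inherits the folder's permissions, but it will not follow item-level permissions on the original file, so that part still needs separate handling.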

Proposal
Extend the SharePoint/OneDrive connector for Copilot Studio so it can ingest and index OCR‑extracted text stored in SharePoint metadata columns, in addition to file body content.
Provide an option to select which columns to index, and ensure security trimming respects the file’s permissions.

Categories:

Copilot Studio skills development

Hi there!

You are correct. When you add knowledge directly to Copilot Studio, the platform currently relies on semantic search only. This means it does not implement a full multimodal RAG (Retrieval-Augmented Generation) approach out of the box.

If your scenario requires more advanced capabilities—such as combining text, images, or other data types, or having more control over indexing and retrieval—you would need to use Azure AI Search. With Azure AI Search, you can build a multimodal RAG solution and then connect it to your Copilot as an external knowledge source.
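
As a rough sketch of that route, the snippet below builds the JSON body for pushing OCR text into an Azure AI Search index through the documents indexing endpoint. The field names (`id`, `title`, `content`, `url`, `group_ids`) and the input dict shape are hypothetical; your index schema would have to define matching fields. The key is URL-safe base64 of the file URL because document keys only accept letters, digits, dashes, underscores, and equals signs.

```python
import base64


def make_key(file_url: str) -> str:
    """Encode the file URL into a string that is valid as a document key."""
    return base64.urlsafe_b64encode(file_url.encode("utf-8")).decode("ascii")


def build_upload_batch(files: list[dict]) -> dict:
    """Build the request body for a 'mergeOrUpload' indexing batch.

    Each input dict is assumed (hypothetically) to have 'url', 'name',
    'ocr_text', and optionally 'group_ids' for security trimming.
    """
    actions = []
    for f in files:
        actions.append({
            "@search.action": "mergeOrUpload",
            "id": make_key(f["url"]),
            "title": f["name"],
            "content": f["ocr_text"],
            "url": f["url"],
            # Hypothetical field: IDs of groups allowed to see the file.
            "group_ids": f.get("group_ids", []),
        })
    return {"value": actions}
```

If you keep something like `group_ids` on each document, one common pattern is to filter queries on that field with the caller's group memberships so results respect the file's permissions. Treat that as a design option to validate, not a built-in guarantee.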

Regarding feature requests, this forum is mainly intended for discussion and support, not for submitting product ideas. If you’d like to propose an enhancement or new capability, the best place to do so is the Power Platform Ideas portal. There, other community members can vote on your idea, and Microsoft regularly reviews these suggestions when planning future product updates: https://ideas.powervirtualagents.com/d365community/forum/21a4f1f9-f7fc-ec11-82e6-000d3a8b109b