SharePoint Folder Knowledge source (Unstructured Knowledge source) is unable to index data

Posted on by 4
We have an agent connected to a SharePoint folder as a knowledge source (unstructured knowledge source) in the Copilot Studio portal. The folders contain Excel files that hold our business data. According to Microsoft, a SharePoint folder knowledge source supports 1,000 files, 50 subfolders, and 10 levels of subfolders. It supports file types such as .doc, .docx, .pdf, .xls, and .xlsx, with a file size of up to 512 MB per file.
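The limits quoted above can be checked before uploading. The following is a minimal sketch, assuming the SharePoint folder is available as a local or synced directory; the function name `check_folder` and the report layout are hypothetical, and the limit values are taken from this post rather than verified against current Microsoft documentation.

```python
import os

# Limits as quoted in the post; treat them as assumptions and confirm
# them against current Microsoft documentation before relying on them.
MAX_FILES = 1000
MAX_SUBFOLDERS = 50
MAX_DEPTH = 10
MAX_FILE_BYTES = 512 * 1024 * 1024


def check_folder(root, max_file_bytes=MAX_FILE_BYTES):
    """Count files, subfolders, and nesting depth under root and list oversized files."""
    root = os.path.abspath(root)
    files = subfolders = deepest = 0
    oversized = []
    for current, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(current, root)
        # The root itself is depth 0; each nested folder level adds one.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        deepest = max(deepest, depth)
        subfolders += len(dirnames)
        for name in filenames:
            files += 1
            path = os.path.join(current, name)
            if os.path.getsize(path) > max_file_bytes:
                oversized.append(os.path.relpath(path, root))

    problems = []
    if files > MAX_FILES:
        problems.append(f"{files} files exceeds {MAX_FILES}")
    if subfolders > MAX_SUBFOLDERS:
        problems.append(f"{subfolders} subfolders exceeds {MAX_SUBFOLDERS}")
    if deepest > MAX_DEPTH:
        problems.append(f"depth {deepest} exceeds {MAX_DEPTH}")
    if oversized:
        problems.append(f"{len(oversized)} file(s) over the size limit")
    return {
        "files": files,
        "subfolders": subfolders,
        "depth": deepest,
        "oversized": oversized,
        "problems": problems,
    }
```

Calling `check_folder("path/to/synced/folder")` returns the counts plus a `problems` list, which is empty when the folder is within the quoted limits.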
 
So that the agent can search near-real-time data, we replace the existing Excel file until it reaches 512 MB; once it reaches the limit, we upload a new file to the SharePoint folder. So far, we have added 10 individual SharePoint folders as knowledge sources to the Copilot Studio agent.
 
The SharePoint data is indexed into Dataverse, after which the agent can answer user queries. According to Microsoft, a scheduled job synchronizes and indexes the data in Dataverse, and this process can take approximately 6-8 hours.
 
We are observing that the agent does not complete indexing and cannot answer user queries. For one specific file, the agent cannot answer any user query at all. This file contains business data; it has no sensitivity label applied and is not password protected.
 
Has anyone faced a similar issue? Is there a way to verify that indexing has completed?
Agent Response.png
  • Suggested answer
    11manish Profile Picture
    3,333 on at
    Most likely, that specific file is in a stale or failed indexing state, often caused by repeated overwrites or complexity in the Excel structure.
     
    Recommended fix:
    • Remove and re-add the SharePoint source or affected file.
    • Upload a fresh, renamed version instead of overwriting.
    • Wait for full sync (6–12+ hours).
    • Validate with specific test queries.
  • Suggested answer
    Valantis Profile Picture
    6,735 on at
     
    A few confirmed things from Microsoft docs that directly apply to your situation.

    How to verify index status: go to Copilot Studio > your agent > Knowledge page. Each knowledge source shows a status column. Look for Ready, Indexing, or Failed. If a specific file or folder shows anything other than Ready, that's your confirmation it hasn't completed indexing. Microsoft docs confirm: "Once the status of your items is set to Ready, you can ask your agent questions."

    For the specific file that isn't working: the most common cause for a single file failing silently in an otherwise working knowledge source is the content structure inside the Excel file.
     
    Microsoft's ingestion process chunks and vector-indexes the content for semantic matching. Excel files with merged cells, complex formatting, pivot tables, or data structured as raw numbers without text context often extract poorly and produce empty or unusable chunks, so the file shows as Ready but the agent gets nothing meaningful from it. Try creating a clean flat table version of that specific file (no merged cells, plain column headers, text values where possible) and test with that.
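One way to spot the structural features mentioned above before uploading is to look inside the workbook package. This is a minimal sketch using only the Python standard library, assuming a standard (non-strict) .xlsx file, which is a ZIP archive of XML parts; the function name `inspect_xlsx` is hypothetical, and the check says nothing about how Copilot Studio actually extracts content.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by worksheet parts in standard (transitional) .xlsx files.
# Strict-mode workbooks use a different namespace and are not handled here.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def inspect_xlsx(path):
    """Count merged cell ranges and pivot table parts in an .xlsx file."""
    findings = {"merged_ranges": 0, "pivot_tables": 0}
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name.startswith("xl/worksheets/") and name.endswith(".xml"):
                # Each <mergeCell ref="A1:B2"/> element is one merged range.
                root = ET.fromstring(zf.read(name))
                findings["merged_ranges"] += len(root.findall(f".//{NS}mergeCell"))
            elif name.startswith("xl/pivotTables/") and name.endswith(".xml"):
                findings["pivot_tables"] += 1
    return findings
```

A result with nonzero counts only suggests that the file is a candidate for the flat-table rewrite described above; it does not confirm that indexing failed for that reason.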

    For your overwrite approach (replacing the same file repeatedly): Microsoft docs warn that the sync job runs every 6-8 hours and overwriting a file can cause a stale or failed indexing state. Instead of overwriting, upload a renamed new file and remove the old one. This forces a fresh ingestion rather than a delta sync that may fail.
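The rename-instead-of-overwrite suggestion can be sketched as a small naming helper. The function `versioned_name` and the file name used in the usage note are hypothetical; this only builds the new name and does not upload or delete anything in SharePoint.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def versioned_name(filename, when=None):
    """Return filename with a UTC timestamp inserted before the extension."""
    # Naive datetimes are interpreted as local time by astimezone().
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    p = PurePosixPath(filename)
    return f"{p.stem}_{when:%Y%m%dT%H%M%SZ}{p.suffix}"
```

For example, `versioned_name("BusinessData.xlsx")` yields a name such as `BusinessData_<timestamp>.xlsx`, so each upload is treated as a new file rather than an overwrite of the previous one.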

    Also confirmed: you're using 10 SharePoint folders as knowledge sources. Microsoft docs state agents can use only 5 different source types at a time, but this refers to source types (SharePoint, Dataverse, OneDrive etc.) not the number of folders within one type. 10 SharePoint folders should be fine as they count as one source type.
     
     

     

    Best regards,

    Valantis

     


  • Suggested answer
    Haque Profile Picture
    3,653 on at
    Hi @CU21050757-0,
     
    I have two observations:
     
    First: even if the file is not protected, complex formatting, very large tables, or unsupported Excel features such as macros and external links can cause the indexing job to fail or stall.

    Second: files close to the 512 MB limit, or files with very large datasets, may take longer to index or may fail to index completely. Splitting the file into smaller parts, for example around 200-300 MB each, can help.

    Note: Syncing large files between SharePoint and Dataverse can be unreliable; the job may encounter errors or timeouts. Can you check the platform logs or telemetry for errors?
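The splitting idea above can be illustrated with a size-budgeted split. This is a minimal sketch, assuming the data can be exported as rows and written out as CSV; the function name `split_rows` is hypothetical, and CSV byte counts are only a rough stand-in for the size of the resulting Excel files.

```python
import csv
import io


def split_rows(header, rows, max_bytes):
    """Split rows into CSV parts, each repeating the header and staying under max_bytes.

    A single row larger than the budget still gets its own part.
    """
    def encode(row):
        buf = io.StringIO()
        csv.writer(buf).writerow(row)
        return buf.getvalue().encode("utf-8")

    head = encode(header)
    parts = []
    current, size = [head], len(head)
    for row in rows:
        line = encode(row)
        # Start a new part when adding this row would exceed the budget.
        if len(current) > 1 and size + len(line) > max_bytes:
            parts.append(b"".join(current))
            current, size = [head], len(head)
        current.append(line)
        size += len(line)
    if len(current) > 1:
        parts.append(b"".join(current))
    return parts
```

Each returned part is a complete CSV document with its own header row, so the parts can be uploaded as independent files.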

     

    I hope these pointers help. If they resolve your issue, please check the box "Does this answer your question?", and a like on the response is appreciated.
  • CU21050757-0 Profile Picture
    4 on at
     
    Thanks for your suggestions.
     
    We want the agent to refer to the updated data when answering user queries. Can you suggest the best way to update the Excel file with new data?
     
    Right now we replace the existing file until it reaches 512 MB. Once the file reaches 512 MB, we upload a new file to the folder. This keeps us within the limit on the number of files.
     
    FYI: we tested the replace functionality with the agent using test data. After the file was replaced, the agent could answer only from the old, already-indexed data; it has to wait for the new data to be indexed.
     
    Any other options for updating the data within the file would be helpful.
  • Suggested answer
    Haque Profile Picture
    3,653 on at

    Hi @CU21050757-0,

    Your current approach is a practical strategy: you replace the existing Excel file with updated data until it reaches the 512 MB limit, then upload a new file to the SharePoint folder to stay within the file count limits.

    If the goal is to make sure the agent always refers to the most recent data, follow these steps:

     

    File Replacement and Naming: When replacing the existing file, keep the file name consistent so the knowledge source reference remains unchanged. For example, each time you update the data, replace the file with the new Excel file but keep exactly the same file name (BusinessData_Current.xlsx). When the file size approaches the 512 MB limit, upload a new file under a new name that carries a timestamp or version number, for example BusinessData_2024Q2.xlsx. Keep the old file (BusinessData_Current.xlsx) until the new file is fully indexed and ready.

    Knowledge Source Configuration: If the agent uses a SharePoint Folder knowledge source, ensure it is configured to include all relevant files in the folder. When adding a new file, the folder-based knowledge source will index all files, so the agent can access data from multiple files seamlessly.

    Indexing and Synchronization: SharePoint Folder knowledge sources sync data to Dataverse through a scheduled job that can take several hours (6-8 hours). Plan updates accordingly and allow time for indexing to complete before expecting the agent to reflect new data.

    Splitting Large Files: As mentioned above, if a single Excel file approaches the 512 MB limit, consider splitting it logically into smaller files by date range, department, or category. This improves indexing performance and reduces the risk of sync failures.

    Automate File Management: Let's use Power Automate flows to manage file uploads, replacements, and folder organization to maintain consistency and reduce manual errors. 

    Agent Refresh: After updating files, republish or refresh the agent if needed to ensure it picks up the latest indexed data.
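The naming and timing steps above can be sketched as a small planning helper. This is a minimal sketch under stated assumptions: the function `plan_upload`, the 90% headroom threshold, and the use of 8 hours as the sync window are illustrative choices, while the file names follow the examples given in this reply. It performs no SharePoint or Power Automate calls.

```python
from datetime import datetime, timedelta, timezone

LIMIT_BYTES = 512 * 1024 * 1024   # per-file limit quoted in this thread
SYNC_WINDOW = timedelta(hours=8)  # upper end of the quoted 6-8 hour sync


def plan_upload(current_bytes, new_bytes, uploaded_at, headroom=0.9):
    """Pick the target file name and estimate when the data may be indexed.

    headroom is a hypothetical safety margin: roll over to a new file once
    the current file would pass that fraction of the size limit.
    """
    if current_bytes + new_bytes <= LIMIT_BYTES * headroom:
        target = "BusinessData_Current.xlsx"
    else:
        quarter = (uploaded_at.month - 1) // 3 + 1
        target = f"BusinessData_{uploaded_at.year}Q{quarter}.xlsx"
    return target, uploaded_at + SYNC_WINDOW
```

The returned time is only an estimate of when to start test queries; the status shown for the knowledge source in Copilot Studio remains the actual indicator that indexing has finished.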

     



     

     

     

     

     
