Power Platform Community / Forums / Copilot Studio / SharePoint Knowledge S...
Copilot Studio

SharePoint Knowledge Source gives hallucinations? Looking for advice

Posted by MM-07030830-0

Hey all, I'm using Copilot Studio and have connected a SharePoint site as a knowledge source. The site hosts around 1800 PDF documents, and I've added the entire site as a single knowledge source for my agent.


The problem is that the agent frequently gives factually incorrect answers, even when the correct information clearly exists in the documents; it often "hallucinates" and/or mixes up details. I'm assuming it's trying to parse too much unstructured data at once.


What I did:

  • turned on orchestration (alongside generative AI) and turned off "Allow the AI to use its own general knowledge"

  • turned on enhanced search results (yes, I do have a Microsoft 365 Copilot license in the same tenant as the agent)

  • set content moderation to medium (maybe I should try high as well now)


Possible next steps:

  • I saw in some documentation that SharePoint knowledge sources have a limit of 200 files per source, so maybe I should break my knowledge into 9 separate sources containing 200 PDF documents each. (docs: Unstructured data as a knowledge source - Microsoft Copilot Studio | Microsoft Learn). But I wonder whether this is relevant in my case, since my understanding is that there are two types of SharePoint knowledge sources (an SP URL, and direct documents from SP).
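For what it's worth, the batching arithmetic above is easy to sketch. A minimal sketch, assuming the document names are available as a flat list (the file names below are invented; only the 200-file-per-source limit comes from the docs):

```python
# Hypothetical sketch: splitting 1800 PDFs into batches that respect the
# documented 200-file limit per SharePoint knowledge source.
# File names are invented for illustration.

def batch_files(files, limit=200):
    """Split `files` into consecutive batches of at most `limit` items."""
    return [files[i:i + limit] for i in range(0, len(files), limit)]

docs = [f"doc_{n:04}.pdf" for n in range(1800)]
batches = batch_files(docs)
print(len(batches))  # 9 batches, 200 files each
```

Each batch would then be registered as its own knowledge source, so nothing silently falls over the per-source cap.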

I wondered if anyone has tips on the best way to move forward. I understand that most likely I will need to improve the description of the knowledge source and maybe clean the source some more (some folders are unnecessary and can be removed). Would it help if I added some sort of semantic mapping in the general instructions? What I mean is that I explain where to look (a relative path) in the knowledge source when a certain kind of question is asked.


Any tips or opinions I would greatly appreciate. :)

  • ronaldwalcott (3,847 · Super User 2025 Season 2)
    What type of queries are generating hallucinations? Is it possible that the data contains conflicting information? It could also be caused by language phrasing.
  • Jcook (7,781 · Most Valuable Professional)
    Hi,
     
    you can try a few things:
    1. I would try with Enhanced Search turned off.
     
    2. Try adding retrieval hints in the agent's instructions.
    example:
    - For payroll-holiday questions, prefer files in /HR/Benefits/Holidays.
     
  • MM-07030830-0 (39)
    @Jcook, any explanation as to why I should turn off Enhanced Search? I am already implementing the second suggestion as I write this; I will let you know after testing in a couple of days.
     
    @ronaldwalcott, the problem comes from general questions on the knowledge. The answers are present in the documents and there is no conflicting information (it's tough to be 100% sure given the large number of files, but I analyzed a lot of the documentation and haven't found any conflicts).
  • sabe (3)
    I have the same situation: a large SP document library with company policies and procedures, and I cannot consistently get a precise answer about how to create an out-of-office message. The knowledge document contains the exact text to be included in the OOO message, but the agent is unable to quote it word for word; it misses and replaces words and phrases.
    It's even worse when I turn on Enhanced Search, so I suggest turning it off.
  • SB-26110046-0 (3)
    My agent is generating responses using its own general knowledge with GenAI Orchestration Enabled and General Knowledge Disabled.
     
    I have a single knowledge source which is a word document uploaded directly to the agent. I ask a question that has no answer in the document, and it generates an answer anyway. 
     
    GenAI Orchestration seems broken when it comes to knowledge.
  • Verified answer from MM-07030830-0 (39)
    What gave me better results in the end was a lot of data cleaning (removing duplicate files from the knowledge source) and breaking each folder into a separate SharePoint knowledge source.

    Additionally, ingesting files directly as unstructured data gave me way better responses.
     
    Hopefully some might find this useful :)
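The per-folder split from the verified answer can be sketched in the same spirit. A minimal sketch, assuming document paths are available as strings (the paths below are invented for illustration):

```python
from collections import defaultdict

# Hypothetical sketch: grouping document paths by top-level folder so that
# each folder can be added as its own SharePoint knowledge source.
# Paths are invented for illustration.
paths = [
    "HR/Benefits/holidays.pdf",
    "HR/Payroll/overtime.pdf",
    "IT/Security/vpn.pdf",
]

by_folder = defaultdict(list)
for p in paths:
    top = p.split("/", 1)[0]  # first path segment = top-level folder
    by_folder[top].append(p)

# Each key would become one knowledge source.
print(sorted(by_folder))  # ['HR', 'IT']
```

Grouping by top-level folder keeps each source small and topically coherent, which lines up with the data-cleaning advice above.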

