My colleagues and I have been investigating this too.
It does not appear to be an LLM hallucination.
It seems to be a bug in the chatbot code. It is putting an example Q&A in the prompt to the LLM, to demonstrate the style of the answer. However, in some circumstances, the LLM takes the example too literally.
It seems likely that Copilot Studio is using this library, or something based on it: https://github.com/Azure-Samples/azure-search-openai-demo/blob/68e78df005fcfaec0859e85d9f341efe7e2e6418/app/backend/approaches/retrievethenread.py#L30
system_chat_template = (
"You are an intelligent assistant helping Contoso Inc employees with their healthcare plan questions and employee handbook questions. "
+ "Use 'you' to refer to the individual asking the questions even if they ask with 'I'. "
+ "Answer the following question using only the data provided in the sources below. "
+ "For tabular information return it as an html table. Do not return markdown format. "
+ "Each source has a name followed by colon and the actual information, always include the source name for each fact you use in the response. "
+ "If you cannot answer using the sources below, say you don't know. Use below example to answer"
)
# shots/sample conversation
question = """
'What is the deductible for the employee plan for a visit to Overlake in Bellevue?'
Sources:
info1.txt: deductibles depend on whether you are in-network or out-of-network. In-network deductibles are $500 for employee and $1000 for family. Out-of-network deductibles are $1000 for employee and $2000 for family.
info2.pdf: Overlake is in-network for the employee plan.
info3.pdf: Overlake is the name of the area that includes a park and ride near Bellevue.
info4.pdf: In-network institutions include Overlake, Swedish and others in the region
"""
answer = "In-network deductibles are $500 for employee and $1000 for family [info1.txt] and Overlake is in-network for the employee plan [info2.pdf][info4.pdf]."
MS needs to either take out the example Q&A, or adjust the prompt, so that the LLM understands it must never return the example in a response.
We have demonstrated this to a MS rep, who accepted that it is a bug and has advised us to raise a support ticket - something we are trying to do now. If you can do the same too, it might help!
Interestingly, there is exactly the same kind of bug in Bing Copilot, but with a different example Q&A.
Start a new conversation, with your first question being 'Summarise it in bullet points'.
It will give you a summary of it's example answer, which is about the height of Everest.