Answered

Copilot Studio returns correct answer on first query but fails when same question is repeated

Posted on by CT-20042235-0

Hi everyone,

I’m seeing inconsistent behavior in Copilot Studio and wanted to check if this is expected or a configuration issue.

Scenario:

1. I ask a question (e.g., "My P41 is failing and not detected in BIOS").
2. The agent successfully retrieves and returns a grounded answer from my knowledge source (Solidigm website).
3. I immediately repeat the exact same question in the same session.

Observed behavior:

Instead of returning the same answer, the agent responds with the message from my fallback topic "I wasn’t able to find an answer on the Solidigm web site..."

Notes:

The agent is configured to only answer based on approved knowledge sources (website grounding enabled)
This is happening within the same chat session (no restart)
No changes to the knowledge source between attempts
The first response is correct and fully grounded

Questions:

Is this expected behavior due to how Copilot Studio handles conversation context between turns?
Is there a recommended way to force consistent retrieval for identical queries in the same session?

Any guidance or similar experiences would be really helpful.

Thanks!

Screenshot 2026-0...

Categories:

General topics

Verified answer

Valantis 6,735 on at

Hi @CT-20042235-0,

This is confirmed expected behavior, not a bug. Microsoft's own FAQ for generative answers explicitly states:

"The question-answering process includes previous chat context, so that repeating a query within a chat session almost guarantees different answers."

The reason: when you ask the same question a second time, the agent now has the first Q&A pair in its conversation context. The generative orchestration interprets the repeated question differently since it sees the previous answer already given. This can cause the retrieval to behave differently, sometimes skipping the knowledge source entirely because the model infers the topic was already addressed.

For your scenario (website-grounded agent with fallback topic): the second query is likely being interpreted as a follow-up or confirmation rather than a fresh knowledge retrieval request, which doesn't trigger the generative answers node and falls through to the fallback topic.

Workarounds:

1. Add explicit agent instructions: "Always search the knowledge source for every user question, regardless of whether a similar question was asked earlier in the conversation."

2. Shorten the conversation context window if that's configurable in your agent's settings to reduce how much prior context influences retrieval decisions.

3. For regression testing: Microsoft explicitly warns that repeating fixed queries is likely to result in varying answers and false positive error reports. Use new sessions for each test, not repeated questions in the same session.
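The regression-testing advice in point 3 can be illustrated with a minimal sketch. Everything here is hypothetical: `FakeSession` is a toy stand-in that merely imitates the behavior reported in this thread (a repeated question falls through to fallback), and it is not a Copilot Studio API. The point is only the shape of the harness, which opens a fresh session for every test case.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

FALLBACK = "I wasn't able to find an answer"


@dataclass
class FakeSession:
    """Toy stand-in for one chat session (hypothetical, not a real API)."""
    history: List[str] = field(default_factory=list)

    def ask(self, question: str) -> str:
        # Imitates the reported symptom: a question already present in
        # this session's history is answered with the fallback message.
        repeated = question in self.history
        self.history.append(question)
        return FALLBACK if repeated else f"Grounded answer for: {question}"


def run_regression(questions: List[str],
                   new_session: Callable[[], FakeSession]) -> Dict[str, bool]:
    """Ask each question in its own fresh session; True means grounded."""
    results = {}
    for question in questions:
        session = new_session()  # fresh context for every test case
        results[question] = not session.ask(question).startswith(FALLBACK)
    return results


if __name__ == "__main__":
    q = "My P41 is failing and not detected in BIOS"
    print(run_regression([q], FakeSession))   # one session per test
    shared = FakeSession()                     # anti-pattern: shared session
    print([shared.ask(q).startswith(FALLBACK) for _ in range(2)])
```

With a real agent, `new_session` would be replaced by whatever starts a new conversation in your channel; the harness logic stays the same.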

Ref: https://learn.microsoft.com/en-us/microsoft-copilot-studio/faqs-generative-answers

Best regards,

Valantis

✅ If this helped solve your issue, please Accept as Solution so others can find it quickly.

❤️ If it didn’t fully solve it but was still useful, please click “Yes” on “Was this reply helpful?” or leave a Like :).

🏷️ For follow-ups @Valantis.

📝 https://valantisond365.com/

💼 LinkedIn

▶️ YouTube

Nicer_Atl 174 on at

@CT-20042235-0, I hate to tell you this, but I suspect many have the same challenge.

This bit: "The generative orchestration interprets the repeated question differently" says a lot about how the GPT actually works. It goes even deeper than what was explained, even though @Valantis did a fantastic job explaining. There is also this little thing called "intent", which the GPT seems to prioritize over everything else. In other words: you can say what you want, I'm going to tell you what I think you need... LoL.

What you are after is a very "deterministic search", which is quite contrary to how the GPT works, as @Valantis explained. I have built complete Power Automate deterministic search flows and attached them as tools to child-agents, not to mention used practically all 8000 characters in the Agent Instructions to try to control the response. Even worse, if you publish the Agent to something like Teams, it may honor your wishes even less, as there are different memory mechanics there. You can do little things like starting a new chat in Teams or in Copilot Studio Test to flush the memory, or force a "Start Over" Topic to run, but even then it isn't reliable.

As an example, in Copilot Studio test I asked the Agent how many configurations of a system require a certain component, which is clearly spelled out and marked down in a RAG Knowledge source, and the response was incorrect. I started a new chat, asked the exact same question, and got the correct answer. I asked a 3rd time without starting a new session and got the same result you did, @CT-20042235-0. I asked a 4th time after restarting the session and got an incorrect response. Interestingly, you can see it fetch the RAG source document in Copilot Studio test, but it still responds incorrectly.

I'm not sure how some are getting really reliable results for this type of application, where responses need to be dead accurate, but I'd like to hear from others more knowledgeable than me.

Suggested answer

11manish 3,333 on at

Based on your description, the most likely explanation is conversation context affecting retrieval combined with confidence-based grounding validation. The first query retrieves the correct Solidigm content, while the second query is evaluated in the context of the prior turn and fails the grounding threshold, causing the fallback topic to trigger.

I would start by reviewing the conversation transcript and the knowledge retrieval diagnostics for both turns. Those logs usually reveal whether the second request:

Retrieved no documents,

Retrieved documents with low confidence, or

Never executed a retrieval step because a topic/fallback path was selected instead.
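As a rough aid for that review, the three outcomes above can be written down as a small triage helper. This is only a sketch under stated assumptions: `TurnDiagnostics` is a hypothetical record you would fill in by hand from what the transcript or test panel shows, the score list and the `0.5` threshold are illustrative, and nothing here reads a real Copilot Studio log format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TurnDiagnostics:
    """Hypothetical hand-filled summary of one conversation turn."""
    retrieval_ran: bool                       # did a knowledge search execute?
    doc_scores: Optional[List[float]] = None  # relevance scores, if visible


def classify_turn(turn: TurnDiagnostics, threshold: float = 0.5) -> str:
    """Map a turn onto one of the three failure modes (or 'ok')."""
    if not turn.retrieval_ran:
        return "no retrieval step (topic/fallback path selected)"
    scores = turn.doc_scores or []
    if not scores:
        return "retrieval ran but returned no documents"
    if max(scores) < threshold:  # illustrative cut-off, an assumption
        return "documents retrieved with low confidence"
    return "ok"


if __name__ == "__main__":
    first_turn = TurnDiagnostics(retrieval_ran=True, doc_scores=[0.82, 0.40])
    second_turn = TurnDiagnostics(retrieval_ran=False)
    print(classify_turn(first_turn))
    print(classify_turn(second_turn))
```

Comparing the label for the first turn with the label for the repeated turn shows which of the three explanations applies before you change any configuration.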

Nicer_Atl 174 on at

@11manish either way, doesn't "conversation context affecting retrieval combined with confidence-based grounding validation" mean that the intent determination is being influenced by the conversation, and that without the right intent to begin with, it will go down the wrong path, i.e. fallback?

I can see from the test panel in Copilot Studio (where the rationale is also shown), if that is what you are referring to as the logs, that it doesn't even reason on the 2nd turn. But starting a new test session always makes it start reasoning without influence.

So for a deterministic search, the challenge becomes how to gracefully make the GPT always start a fresh session or forget previous context. That is fairly easy, if kludgy, in the Copilot Studio Test Panel, but in Teams it seems to be a different story.

CT-20042235-0 73 on at

Thanks everyone for the thoughtful responses.

I updated the agent instructions per Valantis's recommendation, but in testing the agent still routes repeated/rephrased questions to fallback rather than responding from the retrieved content. Seems like something we may have to live with unless anyone has other suggestions.

One clarification on our side: we do want prior conversation context to still be considered (for natural follow-ups), just not to the point where the agent treats a repeated question as already answered and skips responding from the new retrieval.

Suggested answer

Valantis 6,735 on at

Hi @CT-20042235-0,

The problem you're describing is the fundamental tension in generative agents: context awareness for follow-ups vs. treating repeated questions as fresh retrieval triggers. The instruction approach helps but doesn't fully override the model's behavior when it sees a semantically identical prior turn.

A few more approaches to try:

1. Refine the instruction to be more explicit about repetition: instead of "always search for every question", try: "If a user asks a question that appears similar to a previous question in this conversation, treat it as a new question requiring a fresh search. Never assume the previous answer applies."

2. Add a Fallback topic override: in your fallback topic ("I wasn't able to find an answer..."), add a trigger that detects when the fallback fires and immediately calls the generative answers node again with the original user utterance as input. This creates a retry loop that catches the cases where the model skipped retrieval.

3. Use a Start Over topic that users can trigger: "new question" or "search again" phrases could redirect the conversation to a clean context state without ending the session. This gives users a way to reset context for repeated queries.

4. This is a confirmed platform limitation. For scenarios where dead-accurate deterministic retrieval is required regardless of conversation history, the most reliable architecture is a Power Automate flow as a tool that performs the vector search/retrieval directly (bypassing the generative orchestration), then returns the result. The agent calls the tool on every question, the tool doesn't have conversation context, and the response is always grounded. More setup but more reliable.
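Approaches 2 and 4 share one idea: when the context-aware path falls back, rerun the search using only the current utterance. The sketch below models that control flow in plain Python; it is not Copilot Studio or Power Automate code. `stateless_search` swaps in a simple keyword-overlap match in place of the vector search mentioned above, and `generative_answer`, the `docs` dictionary, and the fallback prefix are all hypothetical stand-ins.

```python
import re
from typing import Callable, Dict, List, Optional, Set

FALLBACK = "I wasn't able to find an answer"


def tokenize(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def stateless_search(query: str, docs: Dict[str, str]) -> Optional[str]:
    """Deterministic lookup that sees only the current utterance."""
    query_tokens = tokenize(query)
    best_id, best_score = None, 0
    for doc_id in sorted(docs):  # sorted keys give stable tie-breaking
        score = len(query_tokens & tokenize(docs[doc_id]))
        if score > best_score:
            best_id, best_score = doc_id, score
    return best_id  # None when nothing overlaps


def answer_with_retry(utterance: str,
                      history: List[str],
                      generative_answer: Callable[[str, List[str]], str],
                      docs: Dict[str, str]) -> str:
    """Try the context-aware path; on fallback, retry without history."""
    reply = generative_answer(utterance, history)
    if reply.startswith(FALLBACK):
        doc_id = stateless_search(utterance, docs)
        if doc_id is not None:
            return docs[doc_id]
    return reply
```

Because `stateless_search` never receives the history, the same utterance always maps to the same document, which is the property the tool-based architecture is after. Natural follow-ups still go through the context-aware path first, so prior context is only bypassed when that path has already failed.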

Ref: https://learn.microsoft.com/en-us/microsoft-copilot-studio/faqs-generative-answers

Best regards,

Valantis

Nicer_Atl 174 on at

Suggestion #4 from @Valantis will work fairly well, but not as well if it's part of an Agentic solution, since the Orchestrator Agent still has to determine intent. In my experience, instructing the Orchestrator (using a keyword in the Orchestrator instructions) to route intent to a child-agent, which in turn can invoke the Power Automate (PA) Search tool, is still tricky. So yes, if you are past the intent stage, the PA tool seems to be a decent way to execute the search.
