Copilot Studio
Suggested Answer

Inconsistent results

Posted on by 12
Hello, 
 
I have a very simple agent that is supposed to release sales orders for the current date.
 
These are my instructions:
 
Find No_ where Status = 0 and Shipment Date is equal to current date
Set DocumentNo = No_
use Release Sales Orders with DocumentNo
 
 
The first time I ran the agent it ran perfectly: all 96 orders for the current date were released.
 
I reset my database so all the orders were open again. The next time I ran the agent, it could not find any records.
 
The third time I ran it, it processed only 10 orders correctly. I told the agent that I had 96 open orders and it had processed only 10; it then processed 30 more. I kept commenting in the same conversation that there were more orders to process, and it eventually released them all. Sometimes it would state that it could not find any orders.
 
I reset my database again and have tried multiple different instructions and I cannot get the agent to be consistent.  
 
NOTE: I did see that there is a limit of 10 lines per query on unpublished agents, but this does not seem accurate either, because if I delete the agent, recreate it, and run it for the first time, it works correctly. It is always the second and later queries that seem to go astray.
 
Is publishing the agent the only way to truly test it?
 
 
  • Suggested answer
    Vahid Ghafarpour Profile Picture
    817 on at
    I believe that for business-critical processing like releasing many orders, Copilot Studio is not ideal as the orchestration layer.
    I suggest moving the loop out of the agent:
    Agent > query > loop > release
  • Suggested answer
    Valantis Profile Picture
    5,267 on at
     
    The inconsistency has a specific cause. After the first run processes 96 orders, all those results stay in the conversation context. On the next run the model is working with a much larger context full of previous data, which confuses it when you give the same instruction again. That's why deleting and recreating the agent works -- you get a clean context every time.
     
    The 10-record limit in the test panel is a real limitation: Microsoft docs confirm that the test panel limits query results for unpublished agents. Published agents don't have the same limit.
     
    Vahid's suggestion is the right fix. The agent should not be looping through records across multiple turns. The correct pattern:
     
    1. Agent calls a single Power Automate flow as a tool
    2. The flow queries all 96 orders and loops through them
    3. The flow returns a summary back to the agent
    4. Agent reports the result in one message
     
    This removes the context problem entirely because the agent makes one tool call and gets one result back.
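The four steps above can be sketched in Python to show the shape of the logic. This is only an illustration, not Power Automate code: `SalesOrder`, `release_todays_orders`, and the `release` callback are hypothetical names, and the filter (`status == 0`, shipment date equal to today) is taken from the instructions in the question. The point is that the query, the loop, and the error handling all live in one deterministic function, and the agent only sees the returned summary.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass
class SalesOrder:
    """Hypothetical stand-in for a sales order header row."""
    no: str              # the "No_" field from the question
    status: int          # 0 = open, per the filter in the question
    shipment_date: date


def release_todays_orders(
    orders: Iterable[SalesOrder],
    release: Callable[[str], None],
    today: date,
) -> dict:
    """Release every open order shipping today and return one summary.

    `release` is an assumed callback standing in for the real
    "Release Sales Orders" action; it should raise on failure.
    """
    released, failed = [], []
    for order in orders:
        # Same filter as the agent instructions: open and shipping today.
        if order.status != 0 or order.shipment_date != today:
            continue
        try:
            release(order.no)
            released.append(order.no)
        except Exception as exc:
            # Record the failure and keep going so one bad order
            # does not stop the rest of the batch.
            failed.append({"no": order.no, "error": str(exc)})
    return {
        "matched": len(released) + len(failed),
        "released": released,
        "failed": failed,
    }


if __name__ == "__main__":
    today = date.today()
    demo = [SalesOrder(f"SO{i:03d}", 0, today) for i in range(96)]
    summary = release_todays_orders(demo, lambda no: None, today)
    print(summary["matched"], len(summary["failed"]))
```

Because the function returns counts plus the list of failures, the agent can report "N of M released" in a single message without ever holding the individual records in its conversation context.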
     
    To answer your direct question: publishing will behave more consistently than the test panel, but the architecture problem will still cause inconsistency in production if the agent is looping across multiple turns.
     

     

    Best regards,

    Valantis

     


  • Suggested answer
    Sayali Profile Picture
    Microsoft Employee on at
    Hello @JD-13041615-0,

    The behavior you are seeing is expected and is not caused by a simple “publish vs. test” difference. It mainly comes from how Copilot Studio agents handle data retrieval and reasoning.

    In Copilot Studio, agents are non-deterministic and optimized for summarization, not bulk processing. They often return partial results (commonly around 10 items) even when more records exist, because the model limits how much data it processes at once and may only consider the first subset of results. That explains the inconsistency you see (sometimes 10, sometimes more after follow-ups, and sometimes none): each run generates a new plan and may pick a different subset or stop early. Additionally, testing in the chat panel is conversational and stateful; the agent may reuse prior context or partially completed actions unless you reset the conversation, which further affects consistency.
    Reference:
    Test your agent - Microsoft Copilot Studio | Microsoft Learn

    So no, publishing is not the only way to properly test the agent. Publishing might improve stability slightly, but it won’t fix the core issue. The real solution is to avoid relying on the agent to process large datasets directly. Instead, use a structured action (such as a Power Automate flow or a connector query) that explicitly retrieves all records (with pagination if needed) and processes them deterministically, then let the agent call that action.
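To make "retrieves all records (with pagination if needed)" concrete, here is a minimal Python sketch of the idea. It is an assumption-laden illustration rather than any real connector API: `fetch_all` and the `fetch_page` callback are hypothetical, and the "next page token" convention is just one common way paged sources signal that more data exists. The loop keeps requesting pages until the source says there are none left, so the result never depends on how many rows a single response happens to contain.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One page of results: the records plus a token for the next page,
# or None when this was the last page. (Hypothetical convention.)
Page = Tuple[List[dict], Optional[str]]


def fetch_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every record from a paged source, one page at a time.

    `fetch_page` is an assumed callback: called with None it returns the
    first page, called with a token it returns the page that token names.
    """
    token: Optional[str] = None
    while True:
        records, token = fetch_page(token)
        yield from records
        if token is None:
            # No further pages: every record has been yielded.
            break
```

With a page size of 10 and 96 matching orders, this loop would request ten pages and yield all 96 records, instead of stopping after the first 10.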

    👉 In short: the inconsistency is by design (AI plus data limits), not a bug, and publishing alone won’t solve it. You need to move the data processing into an action or flow for reliable results.
