Copilot Studio
Answered

Copilot Studio agent returns incomplete results from SharePoint. How can I improve reliability?

Hi Power Platform Community,
I am building a Copilot Studio agent for my org's Accounting team. The team has a SharePoint site with ~12k invoice PDFs spread across 8 folders. Each folder contains hundreds of invoice PDFs per fiscal year. I have also created a SharePoint List holding the metadata for all ~12k PDFs, indexed on vendor name, entity, and fiscal year, so that the agent can find the relevant files for a user query. Unfortunately, I couldn't add invoice date or invoice pay date as a column in the SharePoint List, because the team doesn't maintain those values, even though they should.
 
The agent works well for queries like "find the invoice PDF with this invoice number", "what was the total amount on this invoice", or "give me an invoice from this vendor for this fiscal year". However, when I ask for all invoices for a vendor for May 2026, it returns only 8 invoices even though there are 25, so it fails to give the full picture. How can I improve the reliability of the agent?

Implementation Details:
1. I used Invoice Document Library and Invoice SharePoint List as knowledge sources. 
2. Here are the instructions:
"""
You are a Finance Invoice Assistant.

Primary source:
- Use the SharePoint "Invoice_Full_List" as the main source for retrieving and matching invoices.
- Use the indexed list fields Vendor Name, Entity, and Fiscal Year as the primary matching fields.

Secondary source:
- Use the Invoice Library only when document-level details are required from a specific invoice file.
- Use the Invoice Library to extract fields that may only exist inside the document, such as Invoice Date, amount details, invoice number, or receipt content.

Retrieval behavior:
- First retrieve candidate invoices from Invoice_Full_List using available structured fields such as Vendor Name, Entity, Fiscal Year, or filename.
- Do not rely on the Invoice Library to list invoices.
- If the user asks for a specific invoice number or filename, match by Name first.

Date behavior:
- If the user asks for a month, date, or invoice date range, identify candidate files from Invoice_Full_List first, then inspect the corresponding document(s) in the Invoice Library to determine Invoice Date.
- Do not assume Modified date is the same as Invoice Date unless Invoice Date is unavailable.

Latest invoice behavior:
- If the user asks for the latest or most recent invoice, determine the latest matching invoice across the full available dataset, not just the first few returned items.

“All invoices” warning:
- If the user asks for “all” invoices, clearly state:
"Results may be incomplete due to retrieval limitations. Please manually verify in SharePoint if a complete accounting record is required."

Fallback behavior:
- If an exact match is not found, broaden the search gradually and return the closest relevant results instead of nothing.
- Do not ask unnecessary follow-up questions unless required.

Output format:
- Return results clearly and consistently.
- Include:
- Invoice filename
- Vendor Name
- Entity
- Fiscal Year
- Invoice Date (if found)
- Link to file
"""
3. I deployed on Teams, since end-users prefer using this chatbot from Teams.
  • Verified answer
    Sunil Kumar Pashikanti, Moderator
     
    This is expected behavior with Copilot Studio and RAG-based retrieval.
    The agent is not querying all 12K records. It retrieves only a small Top‑K set (typically 5–10 results) based on semantic ranking, which is why you see partial results like 8 instead of 25.

    Main issues:
    1. Retrieval limit (core reason)
    Copilot only brings back a limited number of top matches before answering. It cannot return full datasets reliably.
    2. Missing structured date field
    “May 2026” is not a structured column, so the model is guessing from document text. Semantic search is not reliable for exact date filtering.
     
    What to do:
    1. Add a Date (or Month/Year) column in SharePoint
    This is critical for accurate filtering.
    2. Use Power Automate for “list all” scenarios
    Call a flow from the agent and use OData queries to return complete results.
    3. Guide queries to be narrower
    Encourage filters like vendor + fiscal year + month.
     
    Summary:
    This is not a bot issue. It is a Top‑K retrieval limitation plus missing structured metadata.
    For complete and reliable results, use SharePoint filtering via Power Automate instead of pure Copilot retrieval.
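As a rough illustration of the filtering approach recommended in this answer, the sketch below builds an OData filter string for one vendor and one calendar month, of the kind that could be pasted into the Filter Query field of a SharePoint "Get items" action. The column names `VendorName` and `InvoiceDate` are assumptions (use the internal names of your own list columns), and the quoted ISO date literal is also an assumption: the exact date syntax accepted can vary by endpoint, so verify it against your environment.

```python
from datetime import date


def odata_quote(value: str) -> str:
    """Quote a string literal for an OData filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def month_bounds(year: int, month: int) -> tuple[date, date]:
    """Return the first day of the month and the first day of the following month."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def build_invoice_filter(vendor: str, year: int, month: int) -> str:
    """Build a filter for one vendor and one calendar month (half-open date range)."""
    start, end = month_bounds(year, month)
    # VendorName and InvoiceDate are assumed internal column names, not confirmed ones.
    return (
        f"VendorName eq {odata_quote(vendor)} "
        f"and InvoiceDate ge '{start.isoformat()}' "
        f"and InvoiceDate lt '{end.isoformat()}'"
    )


print(build_invoice_filter("Contoso", 2026, 5))
# VendorName eq 'Contoso' and InvoiceDate ge '2026-05-01' and InvoiceDate lt '2026-06-01'
```

The half-open range (`ge` the first of the month, `lt` the first of the next month) avoids having to know how many days the month has, and it still covers the whole last day if the column stores a time component.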
     

    Sunil Kumar Pashikanti, Moderator
    Blog: https://sunilpashikanti.com/posts/
  • Assisted by AI
    Ashlesha-MSFT, Microsoft Employee
     

    Hi there! Thanks for the detailed write-up — your setup is solid, but you're hitting two documented, by-design limits.

    First, the official "Quotas and limits" page for Microsoft Copilot Studio on Microsoft Learn states that SharePoint list queries only return data from the first 2,048 rows. With 12k items in your list, invoices beyond row 2,048 are invisible to the agent regardless of instructions.

    Second, the generative answers knowledge source uses semantic search (top-K retrieval), not a database query — it returns the most relevant matches, not all records satisfying a filter. Asking for "all 25 invoices for Vendor X in May 2026" is a fundamental mismatch with how RAG works.
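The mismatch between top-K retrieval and a filter query can be seen in a toy example. The sketch below uses invented data and a made-up `relevance` score as a stand-in for semantic ranking; it is not how Copilot Studio scores documents, and `k = 8` is chosen only to mirror the numbers in the question.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    name: str
    vendor: str
    month: str        # "YYYY-MM"
    relevance: float  # stand-in for a semantic similarity score


def top_k(items, k):
    """Retrieval-style selection: keep only the k highest-scoring items."""
    return sorted(items, key=lambda inv: inv.relevance, reverse=True)[:k]


def filter_all(items, vendor, month):
    """Query-style selection: keep every item that satisfies the predicate."""
    return [inv for inv in items if inv.vendor == vendor and inv.month == month]


# 25 invented invoices for one vendor and month, plus 75 unrelated ones.
matching = [Invoice(f"inv-{i:03}.pdf", "Contoso", "2026-05", 0.50 + i / 100) for i in range(25)]
others = [Invoice(f"other-{i:03}.pdf", "Fabrikam", "2026-04", 0.10 + i / 1000) for i in range(75)]
library = matching + others

print(len(top_k(library, 8)))                          # 8
print(len(filter_all(library, "Contoso", "2026-05")))  # 25
```

Every item returned by `top_k` is a correct match, yet 17 of the 25 are still missing, which is the same symptom described in the question.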

    To fix this reliably, replace the SharePoint List lookup with a Power Automate action that calls the SharePoint Get Items connector with an OData filter (e.g., Vendor_Name eq 'Contoso' and Fiscal_Year eq '2026'), using pagination to retrieve all matching rows, then pass that structured list back to the agent for formatting.
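The pagination idea mentioned here can be sketched generically: keep requesting pages and following the continuation token until the source reports there are no more. In Power Automate this is normally handled by the action's own pagination setting rather than by hand-written code, so the snippet below is only a conceptual illustration. The page shape (`items`, `next`) and the in-memory `make_fake_source` helper are hypothetical and do not correspond to a real SharePoint response format.

```python
from typing import Callable, Iterator, Optional

Page = dict  # hypothetical shape: {"items": [...], "next": Optional[str]}


def fetch_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Yield every row by following continuation tokens until none remains."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page["items"]
        token = page.get("next")
        if token is None:
            break


def make_fake_source(rows: list, page_size: int) -> Callable[[Optional[str]], Page]:
    """In-memory stand-in for a paged list endpoint (for illustration only)."""
    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token is not None else 0
        end = start + page_size
        next_token = str(end) if end < len(rows) else None
        return {"items": rows[start:end], "next": next_token}
    return fetch_page


rows = [{"FileName": f"inv-{i:03}.pdf"} for i in range(25)]
result = list(fetch_all(make_fake_source(rows, page_size=10)))
print(len(result))  # 25
```

With a page size of 10, the loop makes three requests (10, 10, and 5 rows) and stops when the third page carries no continuation token.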

    Also note: if users access this agent in a Teams group chat or channel (not 1:1), SharePoint knowledge sources silently fail. This too is documented, in "Connect and configure an agent for Teams and Microsoft 365 Copilot" on Microsoft Learn. The Power Automate approach with application-level auth resolves that as well.

     
  • Suggested answer
    RaghavMishra

    Hi there,

    Great write-up - your instructions are well structured. The behavior you're seeing ("all invoices for vendor X, May 2026" returns 8 of 25) is expected with generative knowledge retrieval, and it's mostly a limits/architecture issue rather than a prompt issue. Here's what's going on and how to make it reliable.

    Why "all" queries come back incomplete

    • A SharePoint knowledge source is built for generative answers - it retrieves the most relevant chunks for a query, not an exhaustive, complete list of every matching record. So "give me all 25" naturally returns only the top relevant subset.
    • Per the docs, there are also concrete ingestion limits to be aware of: the file upload knowledge source caps at 500 files, and analytical/structured questions over files "might not be optimal" because the agent can't write and run code over them.

    Improve search quality (Microsoft Learn recommendations)

    • For better SharePoint search results (and support for larger files), use a Microsoft 365 Copilot license in the same tenant as your agent and turn on Work IQ.
    • Without an M365 Copilot license in the same tenant, generative answers can only use SharePoint files under 7 MB due to memory limits - worth checking against your invoice PDFs.
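One way to check the invoice PDFs against the 7 MB figure is to scan a locally synced copy of the library. The sketch below is a minimal example under two assumptions: that "7 MB" means 7 * 1024 * 1024 bytes (the documentation may define it differently), and that the files are available in a local folder.

```python
from pathlib import Path

# Assumption: "7 MB" is read as 7 * 1024 * 1024 bytes.
SIZE_LIMIT_BYTES = 7 * 1024 * 1024


def oversized_pdfs(root: Path, limit: int = SIZE_LIMIT_BYTES) -> list[tuple[Path, int]]:
    """Return (path, size in bytes) for every PDF under root at or above the limit."""
    found = []
    # Note: the "*.pdf" pattern is case-sensitive on some platforms.
    for path in sorted(root.rglob("*.pdf")):
        size = path.stat().st_size
        if size >= limit:
            found.append((path, size))
    return found
```

Calling `oversized_pdfs(Path("invoices"))`, where `invoices` is a hypothetical local folder name, returns the files that would need to be split, compressed, or otherwise handled before relying on generative answers over them.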

    The reliable pattern for "give me all" / exhaustive lists

    • Don't rely on generative retrieval to enumerate completely. Instead, back the "all/list" scenarios with a deterministic action/tool that queries your Invoice_Full_List directly (for example a Power Automate flow or the SharePoint connector with an OData filter on Vendor + the date field), and return that result set.
    • This is exactly why your missing Invoice Date column matters - to filter reliably by "May 2026" you need a real date field in the list to filter on. Adding an Invoice Date column (even backfilled) will let a structured query return the complete 25.
    • Keep your existing generative setup for the "find/summarize a specific invoice" questions, where it already works well.


    Raghav Mishra - LinkedIn | PowerAI Labs

  • Verified answer
    Assisted by AI
    PA-25061244-0

    I resolved this by using SharePoint's Classify and Extract feature with Microsoft's prebuilt Invoice Processing model.

    After providing a few sample invoice PDFs, SharePoint automatically extracted metadata such as Vendor Name, Invoice Date, Invoice Total, Amount Due, and Invoice ID (when available) into SharePoint columns.

    This improved the Copilot Studio agent because it can now use structured metadata rather than relying entirely on PDF content retrieval.

    However, I still see limitations with large result sets. For example, when a vendor has 25 invoices, the agent may return only a subset of them. Because of this, I do not rely on the agent for complete accounting or audit records.

    To mitigate this, I updated the agent instructions to:

    • Prefer SharePoint metadata over document content.
    • Sort results by Invoice Date.
    • Never claim a response contains "all invoices" unless it can be verified.
    • Warn users that large result sets may be incomplete and should be validated directly in SharePoint.

    The solution now works well for finding invoices by vendor, invoice number, date range, and amount. For audits, reconciliations, or any request requiring a complete invoice listing, SharePoint remains the system of record and users are instructed to verify results directly in the document library.

    Updated Agent Instructions: 
    You are a Finance Invoice Assistant.

    Knowledge source:
    - Use the Invoice Library as the primary source.
    - Search invoice metadata first and document content only when necessary.

    Available fields may include:
    - VendorName
    - InvoiceDate
    - InvoiceTotal
    - AmountDue
    - InvoiceId
    - CustomerName
    - FiscalYear
    - Entity
    - File Name

    Rules:
    - Use metadata whenever available.
    - Use InvoiceDate, not Modified Date.
    - Match File Name first when provided.
    - Search InvoiceId and document content when invoice numbers are requested.
    - Never invent invoice values.
    - If a value cannot be found, state that it was not found.
    - Sort invoice results from most recent to least recent using InvoiceDate.
    - If InvoiceDate is unavailable, place those results at the end.
    - Never claim "all invoices", "every invoice", or "complete list" unless explicitly verified.
    - For large result sets, state that additional matching invoices may exist in SharePoint.

    Fiscal Year Rules:
    - The organization's fiscal year runs from November 1 through October 31.
    - FY2026 = November 1, 2025 through October 31, 2026.
    - FY2025 = November 1, 2024 through October 31, 2025.
    - When determining earliest, latest, oldest, most recent, first, or last invoices within a fiscal year, use InvoiceDate.
    - Do not assume January 1 is the start of a fiscal year.

    Response format:
    - File Name
    - Vendor Name
    - Invoice Date
    - Invoice Total
    - Amount Due
    - Fiscal Year (if available)
    - Entity (if available)
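The fiscal year and sorting rules in these instructions are stated in prose for the model, but they are also easy to express deterministically, for example inside a flow or tool that post-processes query results. The sketch below is one possible reading of those rules, not part of the poster's actual solution; the dictionary key `InvoiceDate` mirrors the field name listed above, and the function names are made up for illustration.

```python
from datetime import date

FISCAL_YEAR_START_MONTH = 11  # fiscal year begins November 1, per the rules above


def fiscal_year(invoice_date: date) -> int:
    """Return the fiscal year label: November and December belong to the next year's FY."""
    if invoice_date.month >= FISCAL_YEAR_START_MONTH:
        return invoice_date.year + 1
    return invoice_date.year


def fiscal_year_bounds(fy: int) -> tuple[date, date]:
    """Return the inclusive first and last day of a fiscal year."""
    return date(fy - 1, FISCAL_YEAR_START_MONTH, 1), date(fy, 10, 31)


def sort_most_recent_first(invoices: list[dict]) -> list[dict]:
    """Sort by InvoiceDate descending, keeping undated invoices at the end."""
    dated = [inv for inv in invoices if inv.get("InvoiceDate") is not None]
    undated = [inv for inv in invoices if inv.get("InvoiceDate") is None]
    return sorted(dated, key=lambda inv: inv["InvoiceDate"], reverse=True) + undated


print(fiscal_year(date(2025, 11, 1)))   # 2026
print(fiscal_year(date(2026, 10, 31)))  # 2026
print(fiscal_year_bounds(2025))         # (datetime.date(2024, 11, 1), datetime.date(2025, 10, 31))
```

Computing these values in code rather than leaving them to the model removes one source of inconsistency, since the same invoice date always maps to the same fiscal year.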


    Agent's model: GPT-4.1
