Crawlers used for Copilot Studio and Custom Bing Searches

From what I can find online, Bingbot is used to crawl websites for Bing search, so it feels safe to assume that the results and content I get from a Bing Custom Search come from Bingbot.

However, I want to know whether a Copilot Studio agent uses something different.
 
I have an issue where the Bing Custom Search returns results as desired, including ranking the domains I have set up correctly by relevance. But when I put the same search string or keywords into my agent, it cannot return results from the top domain referenced by the Bing search. No matter what I do, I can't get the agent to use a certain domain, and that includes adding that site as direct knowledge for the agent.
  • Suggested answer
    Beyond The Platforms
    This is a very common point of confusion, and your observation is correct: Copilot Studio does NOT behave the same way as a Custom Bing Search, even if both rely on Bing under the hood.
     
    Key difference: Crawling vs Retrieval behavior
     
    Custom Bing Search:
    - Uses Bing indexing (via Bingbot)
    - Returns ranked search results based on relevance and domain weighting rules
    - You can influence ranking by prioritizing specific domains
     
    Copilot Studio agent:
    - Does NOT execute a classic “search result ranking query”
    - Uses a retrieval + summarization approach (RAG)
    - Pulls content from:
      - Indexed knowledge sources (if added)
      - Bing (if web search is enabled)
    - Then the LLM decides what content to use when generating the answer
     
    This leads to the behavior you are seeing:
    - Your domain is correctly ranked in Bing
    - But the agent does not always select or include it in the answer
    - Even if the same query is used
     
    Why your domain is not being returned
     
    There are several important reasons:
     
    1) LLM-controlled selection
    Even when Bing returns your domain, the agent does not blindly use top-ranked results. The model selects sources based on:
    - Perceived relevance to the question
    - Content quality and extractability
    - How well the content answers the intent
     
    So “top result in Bing” ≠ “used by Copilot”
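
A toy sketch of why the two can diverge. This is not Copilot Studio's actual selection logic, which is not public. The word-overlap score, the URLs, and the extracted text below are all made-up assumptions, chosen only to show that a re-scoring step can pass over the top-ranked search result.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for semantic matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def select_for_answer(question: str, ranked_results: list[dict], top_k: int = 1) -> list[dict]:
    """Re-score results by word overlap between the question and extracted text.

    ranked_results is assumed to be in search-engine order (best first).
    A result with no overlapping text is dropped regardless of its rank.
    """
    q = tokens(question)
    scored = []
    for rank, result in enumerate(ranked_results):
        overlap = len(q & tokens(result["extracted_text"]))
        scored.append((overlap, -rank, result))  # ties fall back to search rank
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [result for overlap, _, result in scored[:top_k] if overlap > 0]

# Hypothetical results, listed in search-rank order.
results = [
    {"url": "https://docs.example.com/",
     "extracted_text": "Home Products Support Login"},
    {"url": "https://blog.example.org/reset",
     "extracted_text": "To reset the device hold the power button for ten seconds"},
]

chosen = select_for_answer("How do I reset the device?", results)
print([r["url"] for r in chosen])  # ['https://blog.example.org/reset']
```

The first result ranks highest but exposes only navigation text, so the second result is the one selected.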
     
    2) Knowledge vs Web search priority
    If you added the site as Knowledge:
    - It must be correctly indexed and chunked
    - Content must be accessible and parseable
    - The agent may still prefer other sources if the content is unclear or not directly answerable
     
    3) Content structure issues
    If your domain:
    - Is heavily dynamic
    - Requires authentication
    - Uses complex rendering (JS-heavy pages)
    then the crawler/indexer may not extract usable content, even if Bing ranks it well.
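
One rough way to check this yourself is to look at how much text a page exposes without running JavaScript. The sketch below uses only Python's standard `html.parser`. It is an approximation of what a non-rendering fetcher might see, not a reproduction of the indexer Copilot Studio uses, and both sample pages are invented.

```python
from html.parser import HTMLParser

class VisibleTextParser(HTMLParser):
    """Collects text that sits outside <script> and <style> elements."""
    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def visible_text(html: str) -> str:
    """Return the text a client would see without executing any script."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)

# Hypothetical pages: a client-side-rendered shell and a static page.
js_shell = '<html><body><div id="root"></div><script>renderApp()</script></body></html>'
static_page = "<html><body><h1>Reset steps</h1><p>Hold the power button.</p></body></html>"

print(repr(visible_text(js_shell)))     # ''
print(repr(visible_text(static_page)))  # 'Reset steps Hold the power button.'
```

If the raw HTML of your pages looks like the first sample, a search engine that renders JavaScript may still rank them while a simpler extractor finds nothing to ground an answer on.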
     
    4) Generative answer behavior
    Copilot Studio generates answers rather than returning a list of links. If the model cannot confidently use your content to build an answer, it may ignore it entirely.
     
    How to fix / improve the behavior
     
    1) Force stronger grounding in your source
    In your system instructions or prompts, add:
    “Prefer and prioritize information from [your domain] when answering. Only use other sources if the information is not available there.”
     
    2) Validate Knowledge ingestion
    If using the site as a knowledge source:
    - Ensure pages are publicly accessible
    - Test with very specific queries
    - Check if the content is actually retrieved (debug with narrower prompts)
     
    3) Make content more “LLM-friendly”
    Ensure the pages:
    - Contain clear, structured text
    - Have direct answers (not only navigation or UI elements)
    - Avoid heavy reliance on client-side rendering
     
    4) Reduce ambiguity in prompts
    The more generic the question, the more the model will diversify sources.
    Use specific prompts like:
    “According to [your domain], explain …”
     
    5) Combine approaches
    For maximum control:
    - Keep your domain as Knowledge (primary source)
    - Use prompts to explicitly reference it
    - Avoid relying only on Bing ranking
     
    Summary
     
    - Copilot Studio does not use Bing ranking in a deterministic way
    - It uses retrieval + LLM reasoning, not search result ordering
    - Being top-ranked in Bing does not guarantee usage by the agent
    - You must guide the model and ensure your content is properly indexed and usable
     
    This is expected behavior and by design, not a bug.
     
    Hope this helps!
    Paolo




    Want more tips on Power Platform & AI? Follow me here:

    🔗 LinkedIn: https://www.linkedin.com/in/paoloasnaghi/
    ▶️ YouTube: https://www.youtube.com/@BeyondThePlatforms
    📸 Instagram: https://www.instagram.com/beyond_the_platforms/
    🌐 Website: https://www.beyondtheplatforms.com/


     
  • Suggested answer
    AP-26031104-0 (Microsoft Employee)
    Hi,
    On the crawler question specifically:
    Copilot Studio uses Microsoft's own content fetcher (not Bingbot) when indexing knowledge sources you add directly. For web search via Bing, it uses Bing's index — but as noted, ranking ≠ usage by the agent.
     
    Additional things to check:
    1. Generative answers source settings — In Copilot Studio, go to your agent > Settings > Generative AI and confirm that the knowledge source (your domain) is listed and shows as indexed successfully. A green status doesn't always mean content was extracted correctly.
    2. Test the knowledge source directly — In the agent's Test panel, ask a very specific question that has a clear answer only on your domain. If it still doesn't return it, the content may not be chunked/extractable properly.
    3. Check for content access issues — If your domain uses redirects, login walls, or heavy JavaScript rendering, the indexer may have crawled it but extracted no usable text.
    4. Blocked domain list — Rarely, certain domains may be filtered due to content policies. If the domain is new or low-authority, it may be deprioritized.
    If none of the above resolves it, I'd recommend raising a Microsoft Support ticket.
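
Before opening a ticket, the access issues in point 3 can be screened with a small helper like the one below. It is a sketch under assumptions: the login-path hints and the 200-character threshold are arbitrary heuristics picked for illustration, and the URLs are hypothetical. It takes values you have already gathered for a page (for example the final URL and status reported by your HTTP client after following redirects) and does no network access itself.

```python
from urllib.parse import urlparse

# Assumed heuristics, not an official list.
LOGIN_HINTS = ("login", "signin", "sign-in")
MIN_VISIBLE_CHARS = 200

def access_warnings(requested_url: str, final_url: str,
                    status: int, visible_chars: int) -> list[str]:
    """Flag signs that a fetcher may have received no usable page text."""
    warnings = []
    if status != 200:
        warnings.append(f"HTTP status {status}")
    requested, final = urlparse(requested_url), urlparse(final_url)
    if requested.hostname != final.hostname:
        warnings.append(f"redirected to another host: {final.hostname}")
    if any(hint in final.path.lower() for hint in LOGIN_HINTS):
        warnings.append("final URL looks like a login page")
    if visible_chars < MIN_VISIBLE_CHARS:
        warnings.append("very little visible text (possible client-side rendering)")
    return warnings

# Hypothetical page that bounces anonymous visitors to a sign-in host.
for warning in access_warnings(
    "https://www.example.com/help",
    "https://id.example.com/login?next=/help",
    200,
    40,
):
    print(warning)
```

A page that returns an empty list here can still fail to index for other reasons, so treat the result as a first filter only.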
  • Romain The Low-Code Bearded Bear (Super User 2026 Season 1)

    hello there :)

    I would be careful not to assume that Copilot Studio behaves exactly like the raw Bing Custom Search test experience.

    Bingbot is used to crawl and index websites for Bing, so yes, the content returned by Bing Custom Search is based on Bing-indexed content. However, a Copilot Studio agent does not simply take the Bing Custom Search ranking and expose it directly in the conversation.

    The most important point, in my opinion: at runtime, Copilot Studio adds its own orchestration layer on top of the search results.

    The agent may rewrite the user query, use the conversation context, select or filter knowledge sources, apply grounding checks, provenance checks, semantic similarity checks, and then generate the final answer. Because of that, the final sources used by the agent may not match the raw ranking you see in Bing Custom Search.

    Another important distinction is how the website is configured in Copilot Studio.

    If you add a public website as a knowledge source, Copilot Studio uses web grounding based on Bing Search. That is not necessarily the same thing as using your Bing Custom Search configuration.

    If you want the agent to use your Bing Custom Search instance specifically, you need to configure it in a generative answers node using the Bing Custom Search configuration ID. Simply adding the site as direct knowledge may not force the agent to use that domain in the way your Custom Bing Search test does.

    I would also check whether general web search is enabled on the agent. If it is enabled, the agent may search across public Bing-indexed websites and mix those results with your configured knowledge sources, which can make the behavior different from your custom search setup.
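
To see how far the two experiences actually diverge, it can help to compare domains side by side. The sketch below assumes you have manually copied the result URLs from your Bing Custom Search test and the citation URLs shown under the agent's answer. The function names and the sample URLs are hypothetical, and nothing here calls a Bing or Copilot Studio API.

```python
from urllib.parse import urlparse

def domain(url: str) -> str:
    """Host name with a leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def compare_domains(search_urls: list[str], cited_urls: list[str]) -> list[tuple[int, str, bool]]:
    """For each domain in search order, report (first rank, domain, cited by agent?)."""
    cited = {domain(u) for u in cited_urls}
    seen: set[str] = set()
    report = []
    for rank, url in enumerate(search_urls, start=1):
        d = domain(url)
        if d in seen:
            continue  # keep only the best rank per domain
        seen.add(d)
        report.append((rank, d, d in cited))
    return report

# Hypothetical URLs copied by hand from each experience.
search_urls = [
    "https://www.contoso.example/kb/1",
    "https://www.contoso.example/kb/2",
    "https://fabrikam.example/faq",
]
cited_urls = ["https://fabrikam.example/faq#reset"]

for rank, d, was_cited in compare_domains(search_urls, cited_urls):
    print(rank, d, "cited" if was_cited else "not cited")
```

Running the same comparison with general web search turned off, and again with the Bing Custom Search configuration ID set on the generative answers node, should show which setting changes the cited domains.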
