Copilot Studio public website knowledge source returning "No information was found"

Hi all,
 
I'm building a Copilot Studio agent for our company's support site that retrieves product documentation PDFs from their public DAM. I've set up a public website knowledge source pointing to: https://www.solidigm.com/content/dam/solidigm/en/site/products/documents/
 
When users ask for a document through a Search and Summarize node, the agent consistently returns "No information was found."
 
The files I'm trying to retrieve are publicly accessible PDFs sitting directly under that path. The knowledge source status shows as "Ready."
 
Has anyone successfully used a public website knowledge source to retrieve PDFs from a DAM-style path like this? Any advice on configuration, crawl behavior, or troubleshooting would be appreciated.
 
  • Vish WR
     
Are those PDFs linked from pages on the website, or do they exist only in the content folder?
  • Suggested answer
    Sunil Kumar Pashikanti, Moderator
     
If you have added a public website URL (like /content/dam/.../documents/) as a Knowledge Source, and it shows "Ready" but returns "No information was found," you are likely hitting a Crawl Discovery Gap.
     
    The Root Cause: Crawlers are Link-Followers, not File-Explorers
    Copilot Studio’s public website crawler is designed to mimic a human browsing a site. It follows HTML links (<a> tags) to find content.
    • HTML Pages: Easily discoverable via navigation.
    • DAM/Binary Folders: These are "Asset Stores." They usually lack an HTML interface.
    • The Result: The crawler hits your folder URL, sees a blank response (because directory browsing is disabled on the server), and assumes there is nothing to index. It cannot "guess" the filenames of your PDFs.
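    The link-following behavior described above can be sketched in a few lines of Python. This is an illustration of the general mechanism only, not Copilot Studio's actual crawler:

    ```python
    # Illustrative sketch: a link-following crawler can only discover URLs
    # that appear as <a href> links in the HTML it receives. It cannot
    # guess filenames sitting in a "closed" DAM folder.
    from html.parser import HTMLParser

    class LinkExtractor(HTMLParser):
        """Collects every href found in <a> tags."""
        def __init__(self) -> None:
            super().__init__()
            self.links: list[str] = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def discover(html: str) -> list[str]:
        """Return the URLs a link-following crawler would find in this HTML."""
        parser = LinkExtractor()
        parser.feed(html)
        return parser.links

    # A DAM folder URL with directory browsing disabled returns a page
    # with no links, so the crawler finds nothing to index:
    print(discover("<html><body></body></html>"))  # []

    # An index page that links the PDFs exposes them (filename hypothetical):
    print(discover('<a href="/docs/example-spec.pdf">Spec</a>'))
    # ['/docs/example-spec.pdf']
    ```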
    How to Fix It (Proven Options)
    Option 1: The "Index Page" (Fastest Low-Code Fix)
    Create a simple HTML landing page (e.g., yoursite.com/support/docs) that contains direct links to every PDF you want indexed.
    Why it works: When the crawler hits this page, it sees the links, follows them, and begins indexing the PDF content.
    Tip: Ensure the links are standard <a href="..."> tags and not hidden behind JavaScript buttons.
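    A minimal index page along these lines might look as follows (the filenames are hypothetical placeholders, not actual documents on the site):

    ```html
    <!-- Minimal crawlable index page: plain <a href> links, no JavaScript -->
    <!DOCTYPE html>
    <html lang="en">
    <head><title>Product Documentation</title></head>
    <body>
      <h1>Product Documentation</h1>
      <ul>
        <li><a href="/content/dam/solidigm/en/site/products/documents/example-datasheet.pdf">Example datasheet (PDF)</a></li>
        <li><a href="/content/dam/solidigm/en/site/products/documents/example-guide.pdf">Example user guide (PDF)</a></li>
      </ul>
    </body>
    </html>
    ```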
     
    Option 2: Upload Files Directly
    If your document set is under 500 files and individual files are smaller than 20MB:
    Go to: Knowledge > Add Knowledge > Files.
    Why it works: This bypasses the crawler entirely. Copilot Studio will immediately chunk and index the full text of the PDFs.
     
    Option 3: SharePoint Integration
    If your PDFs are internal or sensitive, move them to a SharePoint Document Library.
    Why it works: Copilot Studio uses the Microsoft Graph API for SharePoint, which performs a direct "file crawl" rather than a "web crawl." It is significantly more reliable for deep directory structures.
     
    Option 4: The XML Sitemap (Advanced)
    If you cannot create a public HTML page, add the direct URLs of every PDF to your site’s sitemap.xml.
    Why it works: The Copilot crawler checks the sitemap to find "deep links" it might have missed during the standard crawl.
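    A sitemap with one entry per PDF might look like this (the document URLs shown are illustrative; substitute the real ones):

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <!-- One <url> entry per PDF the crawler should discover -->
      <url>
        <loc>https://www.solidigm.com/content/dam/solidigm/en/site/products/documents/example-datasheet.pdf</loc>
      </url>
      <url>
        <loc>https://www.solidigm.com/content/dam/solidigm/en/site/products/documents/example-guide.pdf</loc>
      </url>
    </urlset>
    ```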
     
    What will NOT work:
    Waiting longer: If it hasn't indexed in 24 hours, it never will because it can't find the path.
    Changing the Prompt: This is a data-source issue, not a language-model issue.
    Adding more sub-folders: the crawler cannot guess paths, so deeper nesting only makes content harder to discover.
     
    Bottom Line: A web crawler needs a map (HTML links). If you point it at a "closed" folder, it will report as "Ready" (because the URL works) but index zero documents.
     
    ✅ If this answer helped resolve your issue, please mark it as Accepted so it can help others with the same problem.
    👍 Feel free to Like the post if you found it useful.

    Sunil Kumar Pashikanti, Moderator
    Blog:
     https://sunilpashikanti.com/posts/
  • CT-20042235-0
     
    Yes, the links to the PDFs can be found on our Document Management System page. 

    https://www.solidigm.com/products/document-management-system.html
  • CT-20042235-0
     
    Thank you for the detailed response.
     
    I've since confirmed that the PDFs are linked from two places on our site:
    1. The Document Management System page at: https://www.solidigm.com/products/document-management-system.html
    2. Individual product pages across the site:
      1. Example 1: https://www.solidigm.com/products/data-center/d7/ps1010.html
      2. Example 2: https://www.solidigm.com/products/data-center/d7/p5810.html
    I have my agent's knowledge source pointed at www.solidigm.com, but it is still struggling to find these documents.
    To try to improve retrieval, I attempted to narrow the knowledge source specifically to the DMS page in a topic dedicated to document retrieval. However, I ran into a couple of issues:
    1. The knowledge source URL field doesn't appear to accept a .html file extension, so I'm unable to point it directly at https://www.solidigm.com/products/document-management-system.html. Our web team is working on setting up a redirect from the extensionless URL to the .html version. Would a redirect work for the crawler, or does it need to hit the final destination URL directly?
    2. Looking at the page source for the DMS page, the document table appears to be powered by the DataTables library. Could this cause an issue where the crawler sees an empty table because the data is loaded dynamically via JavaScript after page load, rather than being server-rendered in the HTML?
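    To illustrate the second concern: a crawler that does not execute JavaScript receives only the initial HTML, so an Ajax-sourced DataTable would appear to it as an empty shell. The markup below is illustrative only, not our actual page source:

    ```html
    <!-- What a non-JS crawler sees when rows are populated via Ajax: -->
    <table id="document-table">
      <thead><tr><th>Document</th><th>Link</th></tr></thead>
      <tbody><!-- empty: rows are injected by DataTables after page load --></tbody>
    </table>

    <!-- A server-rendered equivalent exposes the links in the raw HTML: -->
    <table>
      <tbody>
        <tr>
          <td>Example datasheet</td>
          <td><a href="/content/dam/solidigm/en/site/products/documents/example-datasheet.pdf">PDF</a></td>
        </tr>
      </tbody>
    </table>
    ```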
     
    When testing the agent with this configuration, the Search and Summarize node returns "No information was found that could help answer this", suggesting the knowledge source is not returning any content despite the knowledge source status showing as "Ready."
     
    For reference, I've attached a simplified version of the topic YAML showing the Search and Summarize node pointed at the DMS knowledge source.

    Any guidance would be appreciated.

