If you have added a public website URL (like /content/dam/.../documents/) as a Knowledge Source, and it shows "Ready" but returns "No information found," you are likely hitting a Crawl Discovery Gap.
The Root Cause: Crawlers Are Link-Followers, Not File-Explorers
Copilot Studio’s public website crawler is designed to mimic a human browsing a site. It follows HTML links (<a> tags) to find content.
- HTML Pages: Easily discoverable via navigation.
- DAM/Binary Folders: These are asset stores. They usually expose no HTML interface, so there are no links for a crawler to follow.
- The Result: The crawler hits your folder URL, sees a blank response (because directory browsing is disabled on the server), and assumes there is nothing to index. It cannot "guess" the filenames of your PDFs.
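To make the gap concrete: Copilot Studio's crawler internals aren't published, but conceptually "link-following" boils down to something like this sketch. An HTML parser collects `<a href>` targets; against a blank directory response it finds nothing, so there is nothing to queue for indexing.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags -- the only thing a web crawler can follow."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def discoverable_links(html: str) -> list[str]:
    """Return every link a crawler could discover on this page."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

# What the crawler sees at a DAM folder URL with directory browsing disabled:
print(discoverable_links("<html><body></body></html>"))  # [] -> nothing to index

# What it sees on a normal HTML page:
print(discoverable_links('<a href="/docs/guide.pdf">Guide</a>'))  # ['/docs/guide.pdf']
```

An empty link list means the crawl ends there: the crawler cannot enumerate files it was never pointed to.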
How to Fix It (Proven Options)
Option 1: The "Index Page" (Fastest Low-Code Fix)
Create a simple HTML landing page (e.g., yoursite.com/support/docs) that contains direct links to every PDF you want indexed.
Why it works: When the crawler hits this page, it sees the links, follows them, and begins indexing the PDF content.
Tip: Ensure the links are standard <a href="..."> tags and not hidden behind JavaScript buttons.
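If you have many PDFs, you can generate the index page by script instead of hand-writing it. A minimal sketch (the DAM paths below are hypothetical placeholders; substitute your real asset URLs):

```python
from html import escape

def build_index_page(pdf_urls: list[str], title: str = "Support Documents") -> str:
    """Render a minimal index page with one plain <a href> link per PDF.

    Plain anchors are deliberate: links injected by JavaScript after page
    load are invisible to the crawler.
    """
    items = "\n".join(
        f'    <li><a href="{escape(url, quote=True)}">'
        f'{escape(url.rsplit("/", 1)[-1])}</a></li>'
        for url in pdf_urls
    )
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n  <ul>\n{items}\n  </ul>\n</body></html>"
    )

# Hypothetical paths -- replace with your own:
page = build_index_page([
    "/content/dam/help/returns-policy.pdf",
    "/content/dam/help/warranty-faq.pdf",
])
```

Publish the output at a crawlable URL, then point the Knowledge Source at that page.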
Option 2: Upload Files Directly
If your document set is under 500 files and individual files are smaller than 20MB:
Go to: Knowledge > Add Knowledge > Files.
Why it works: This bypasses the crawler entirely. Copilot Studio will immediately chunk and index the full text of the PDFs.
Option 3: SharePoint Integration
If your PDFs are internal or sensitive, move them to a SharePoint Document Library.
Why it works: Copilot Studio uses the Microsoft Graph API for SharePoint, which performs a direct "file crawl" rather than a "web crawl." It is significantly more reliable for deep directory structures.
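You don't call Graph yourself for this option, but the "file crawl" described above amounts to direct enumeration of a document library rather than link-following. A rough sketch of what that enumeration looks like (the `/children` endpoint is a real Graph API route; `site_id` and `token` are placeholders, and token acquisition is omitted):

```python
import json
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"

def children_url(site_id: str) -> str:
    """Graph endpoint listing the root folder of a site's default document library."""
    return f"{GRAPH}/sites/{site_id}/drive/root/children"

def list_library_files(site_id: str, token: str) -> list[dict]:
    """Enumerate files directly -- no HTML links involved, which is why this
    approach works for deep folder trees a web crawler cannot see."""
    req = urllib.request.Request(
        children_url(site_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("value", [])
```

The key contrast with the web crawler: every file in the library is returned as structured data, whether or not anything links to it.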
Option 4: The XML Sitemap (Advanced)
If you cannot create a public HTML page, add the direct URLs of every PDF to your site’s sitemap.xml.
Why it works: The Copilot crawler checks the sitemap to find "deep links" it might have missed during the standard crawl.
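If you generate the sitemap by script rather than by hand, a minimal sketch looks like this (the PDF URL below is hypothetical; list every PDF you want indexed):

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls: list[str]) -> bytes:
    """Emit a minimal sitemap.xml with one <url><loc> entry per document."""
    ET.register_namespace("", SITEMAP_NS)  # serialize without a prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for u in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = u
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)

# Hypothetical URL -- replace with your real PDF locations:
xml_bytes = build_sitemap([
    "https://yoursite.com/content/dam/help/returns-policy.pdf",
])
```

Place the output at your site root (typically `/sitemap.xml`) so the crawler can find it.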
What will NOT work:
Waiting longer: If the content has not been indexed within 24 hours, it never will be, because the crawler cannot discover the path at all.
Changing the Prompt: This is a data-source issue, not a language-model issue.
Adding more sub-folders: More folders only make it harder for a crawler to guess the path.
Bottom Line: A web crawler needs a map (HTML links). If you point it at a "closed" folder, it will report as "Ready" (because the URL works) but index zero documents.
✅ If this answer helped resolve your issue, please mark it as Accepted so it can help others with the same problem.
👍 Feel free to Like the post if you found it useful.