
ALM for Copilot Studio: Moving Uploaded File Knowledge Sources via Azure DevOps Pipelines

Posted by RodrigoCardosoDaSilva

Hi everyone,

My team is currently implementing an ALM strategy for our Copilot Studio agents. Our architecture involves a development environment (unmanaged solutions) and a production environment (managed solutions), orchestrated via Azure DevOps pipelines. While the pipeline works perfectly for standard agent topics and configurations, we are hitting a roadblock with agents that use uploaded files (PDF, DOCX, etc.) as Knowledge sources.

The Issue:

When the managed solution is imported into the production environment, the agent contains references to the files, but not the actual file content. This results in:

  • Broken topics/Generative Answers.

  • Errors about missing attachments in the agent’s knowledge base.

Current workaround: we manually re-upload the files to the production environment after every deployment, which defeats the purpose of our automation.

Our Goal:

We want to automate the migration of these knowledge files within our Azure DevOps pipeline.

Questions:

  1. Is there a specific Power Platform Build Tools task or a Dataverse Service Client command that handles the migration of the underlying Chatbot or Knowledge tables where these files are stored?

  2. Has anyone successfully used the Configuration Migration Tool (CMT) or the Power Platform CLI (pac data export/import) within a pipeline to move these specific file-based dependencies?

  3. Are there known limitations regarding how "Uploaded Files" are stored in Dataverse that prevent them from being moved via standard ALM?

Any guidance or documentation on how to include these data-bound assets in an automated deployment would be greatly appreciated!

  • Verified answer (Assisted by AI)
    Elliot M.:
    You've hit a known limitation — uploaded file knowledge sources are stored as document content in Dataverse, and solution packages transport component metadata and references, not the actual file binary content. This is why the references survive but the content doesn't.
     
    There is no built-in solution task or PAC CLI command that migrates uploaded file content as part of a managed solution import today.
     
    Here are your practical options, in order of recommendation:
     
    1. Switch to SharePoint as your knowledge source. This is the cleanest ALM path. Upload your documents to a SharePoint document library, then point your agent's knowledge to the SharePoint URL instead of uploaded files. SharePoint knowledge sources are referenced by URL — the content lives in SharePoint, not in Dataverse, so there's nothing to migrate. Both dev and prod agents point to the same (or environment-specific) SharePoint site. This is what Microsoft recommends for production agents.
     
    2. Switch to public website URLs. If your documents are published internally (e.g., on an intranet or Learn site), you can use website URLs as knowledge sources. Same benefit — the reference travels with the solution, the content lives externally.
     
    3. Post-deployment script using Power Platform CLI. If you must use uploaded files, you can automate the re-upload step in your pipeline. After the managed solution import step, add a pipeline task that uses the PAC CLI or Dataverse Web API to upload the files programmatically. This isn't elegant, but it closes the gap. You'd store your source documents in your repo and script the upload against the target environment (see the sketch after this list).
     
    4. Configuration Migration Tool (CMT). CMT can export/import Dataverse data including file attachments. In theory, you could use it to transfer the knowledge file records. However, I'd test this carefully — the knowledge source indexing may not automatically trigger after a CMT import, which could leave you with files present but not searchable.
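     
    Here's a rough, untested sketch of what that option 3 pipeline task could look like in PowerShell. Everything in angle brackets is a placeholder, the filedata file column on botcomponent is an assumption you should verify against your environment, and a single-request upload like this only handles files under roughly 128 MB (larger files need Dataverse's chunked upload):
     
    # Acquire a Dataverse token via the client credentials flow
    # (the service principal must be an application user in the target env).
    $envUrl = "https://yourorg.crm.dynamics.com"   # placeholder target environment
    $token = (Invoke-RestMethod -Method Post `
        -Uri "https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token" `
        -Body @{
            client_id     = "<app-id>"
            client_secret = "<client-secret>"
            scope         = "$envUrl/.default"
            grant_type    = "client_credentials"
        }).access_token
    $headers = @{ Authorization = "Bearer $token" }
     
    # PATCH the binary into the file column of the botcomponent record
    # that the managed solution import created with empty content.
    $recordId = "<botcomponent-guid>"   # placeholder
    $filePath = "$env:BUILD_SOURCESDIRECTORY/knowledge/handbook.pdf"
    Invoke-RestMethod -Method Patch `
        -Uri "$envUrl/api/data/v9.2/botcomponents($recordId)/filedata" `
        -Headers ($headers + @{ "x-ms-file-name" = [IO.Path]::GetFileName($filePath) }) `
        -ContentType "application/octet-stream" `
        -InFile $filePath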
     
    My recommendation: Option 1 (SharePoint). It eliminates the problem entirely, works better at scale, respects document permissions via Entra ID auth, and is the pattern Microsoft is investing in for enterprise knowledge.
     
    Ref: https://learn.microsoft.com/en-us/microsoft-copilot-studio/knowledge-copilot-studio
    Ref: https://learn.microsoft.com/en-us/microsoft-copilot-studio/authoring-export-import-bots
  • RodrigoCardosoDaSilva:
    Hi Elliot! Thanks for the detailed response.
     
    Regarding your suggestions, I have a few specific considerations:
     
    "1. Switch to SharePoint as your knowledge source.":
    In our case, we are deploying a public agent via a public URL. Because the agent is unauthenticated, it cannot pass a user token to SharePoint. Since SharePoint knowledge sources require Entra ID authentication to respect row-level permissions, the agent wouldn't be able to retrieve any documents. Therefore, SharePoint doesn't seem to be a viable option for this architecture, correct?
     
    "2. Switch to public website URLs." and "3. Post-deployment script using Power Platform CLI.":
    Option 2 seems bulletproof and is our current "Plan B." However, we’d prefer to keep the files managed within the solution lifecycle if possible. I'm going to attempt Option 3 first by implementing a pipeline task using the PAC CLI / Dataverse Web API. I’ll update this thread with my results if I can get the binaries to move successfully.

    "4. Configuration Migration Tool (CMT).":
    I confess I discarded this option before considering the impact on the automatic indexing process, and that definitely raises another red flag. Still, here is what I found while investigating it, in case someone finds it useful:
     
    I first looked into the CMT tool, which can be accessed from PowerShell with the command:
    pac tool cmt
    To use it, you first need to define a schema to export the data. After some investigation, I found that the files are included in the "Bot Component" table (botcomponent). Inside this table, the files are stored in records with the ComponentType attribute equal to "Bot File Attachment". These records also have an attribute called filedata, where I believe the binaries are stored (type "file" according to the documentation available at https://learn.microsoft.com/en-us/power-apps/developer/data-platform/reference/entities/botcomponent).
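     
    In case it helps someone, here is a quick, untested Web API sketch to locate these records (it reuses the $envUrl/$headers setup from the sketch in Elliot's answer; I filter on the formatted label because I haven't verified the numeric option value behind "Bot File Attachment"):
     
    # List bot components and ask Dataverse to annotate option set values
    # with their display labels.
    $resp = Invoke-RestMethod -Method Get `
        -Uri "$envUrl/api/data/v9.2/botcomponents?`$select=botcomponentid,name,componenttype" `
        -Headers ($headers + @{ Prefer = 'odata.include-annotations="OData.Community.Display.V1.FormattedValue"' })
     
    # Keep only the file attachment components.
    $resp.value |
        Where-Object { $_.'componenttype@OData.Community.Display.V1.FormattedValue' -eq 'Bot File Attachment' } |
        Select-Object botcomponentid, name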
     
    However, in the tool's UI, after selecting the "Default Solution" and this "botcomponent" entity, the filedata attribute is not shown as an available field. I even tried to create a custom schema that includes this field (sketched below), but the export operation only carries the metadata, not the binaries. It looks like a deliberate limitation of the CMT tool, probably for security reasons.
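     
    For reference, here is roughly the shape of the custom schema I tried, shown as a PowerShell snippet that writes it to disk. Treat it as approximate: the layout follows the usual CMT schema shape, and type="file" is taken from the botcomponent table reference, but as described above the export ignores the binary content either way:
     
    # Write a CMT-style schema that explicitly lists the filedata column.
    '<entities>
      <entity name="botcomponent" displayname="Bot Component"
              primaryidfield="botcomponentid" primarynamefield="name" disableplugins="true">
        <fields>
          <field displayname="Name" name="name" type="string" />
          <field displayname="Component Type" name="componenttype" type="optionsetvalue" />
          <field displayname="File Data" name="filedata" type="file" />
        </fields>
        <filter />
      </entity>
    </entities>' | Set-Content schema.xml
     
    # pac data export consumes the same schema format as the CMT GUI.
    pac data export --schemaFile schema.xml --dataFile botcomponents.zip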

