I’m seeing inconsistent behavior with the “Capability use” test method when evaluating test cases in Copilot Studio.
Even when my tool is successfully invoked (visible in the Trace and returning correct output),
the test case shows Capability use = Fail.
If I delete the test case and recreate it (with the same question and expected result),
Capability use suddenly shows Pass. No changes to the agent or the tool.
It looks as though the test case caches older capability–tool mappings and only refreshes them
when the test case is recreated.
Steps:
1. Set up an agent with a tool/connector (e.g. get current weather or get current time).
2. Create a test case using the Capability use method and add the get current weather tool.
3. Run the test → the tool is invoked → but Capability use = Fail.
4. Delete and recreate the test → Capability use = Pass.
In the attached screenshot, once I re-added the test case, Capability use started passing, and it did so on the first run.
Expected:
Capability use should pass whenever the mapped tool is invoked.
Is this a known issue, or is there a workaround to force the capability metadata to refresh
without recreating the test every time?
