Hi @JM-20021632-0
Let's see if these steps help:
To set up text evaluation from images in a copilot, we first need to enable image uploads. To do so:
1. Please make sure the CSE (Copilot Studio Environment) allows image uploads as input to the agent. For this, we need to configure the input schema or interface to accept image files (png/jpg).
2. Make sure an OCR service/API is in place to extract text from the image. Possible options are Azure Cognitive Services Computer Vision OCR, the Microsoft Read API, or any open-source OCR library (if running locally). For a hosted service, we need to set up API keys and permissions.
3. In Copilot Studio, let's implement image processing logic (backend or workflow) to receive the uploaded image, send the image to the OCR service and receive extracted text results.
4. Once the text is extracted, send it to the agent's evaluation logic. Here we can use prompt engineering or custom logic to analyze, summarize, or validate the extracted text.
5. To get the results to the user, we can format the agent's output on the basis of text evaluation. We can provide feedback or insights deduced from the image text.
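
The steps above can be sketched roughly as follows. This is only an illustration of the flow, not Copilot Studio code: every name here (`process_upload`, `evaluate_text`, `format_reply`, `fake_ocr`) is hypothetical, the OCR call is passed in as a plain function so that any real service can be plugged in, and the "evaluation" is just a word count standing in for your actual prompt or custom logic.

```python
from pathlib import Path
from typing import Callable

# Step 1: image types the agent accepts (assumed to be png/jpg only).
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension is one of the accepted image types."""
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def evaluate_text(text: str) -> dict:
    """Step 4: placeholder evaluation; real logic could call a prompt instead."""
    words = text.split()
    return {
        "word_count": len(words),
        "is_empty": not words,
        "preview": " ".join(words[:10]),
    }


def format_reply(evaluation: dict) -> str:
    """Step 5: turn the evaluation into a message for the user."""
    if evaluation["is_empty"]:
        return "No readable text was found in the image."
    return f"Found {evaluation['word_count']} words. Preview: {evaluation['preview']}"


def process_upload(filename: str, image_bytes: bytes,
                   ocr: Callable[[bytes], str]) -> str:
    """Steps 1-5: validate the upload, run OCR, evaluate, format the reply."""
    if not is_supported_image(filename):
        return "Please upload a PNG or JPG image."
    text = ocr(image_bytes)  # Steps 2-3: hand the image to the OCR service
    return format_reply(evaluate_text(text))


def fake_ocr(image_bytes: bytes) -> str:
    """Hypothetical stand-in for a real OCR call; always returns fixed text."""
    return "Invoice 1042 total due 250 USD"


if __name__ == "__main__":
    print(process_upload("invoice.png", b"...", fake_ocr))
```

In a real setup, `fake_ocr` would be replaced by a function that calls whichever OCR service you chose in step 2, using the API key configured there.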
Once the above pipeline works, we can test it further:
- To check OCR accuracy, test with various image qualities and text styles.
- Handle errors such as OCR failures and unreadable images.
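
For these two checks, a small helper along the following lines might be useful. Again, this is a sketch under assumptions: `safe_ocr` and `similarity` are hypothetical names, the `min_chars` threshold is arbitrary, and the accuracy score is a simple character-level similarity from Python's standard `difflib`, not an official OCR metric.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional


def safe_ocr(image_bytes: bytes, ocr: Callable[[bytes], str],
             min_chars: int = 3) -> Optional[str]:
    """Run OCR and return None on failure or when too little text comes back."""
    try:
        text = ocr(image_bytes)
    except Exception:
        # Broad catch for the sketch only; with a real SDK, catch its
        # specific error types (timeouts, auth errors, bad image format).
        return None
    text = text.strip()
    # Treat near-empty output as an unreadable image (assumed threshold).
    return text if len(text) >= min_chars else None


def similarity(expected: str, actual: str) -> float:
    """Score from 0.0 to 1.0 comparing OCR output with the known text."""
    return SequenceMatcher(None, expected.lower(), actual.lower()).ratio()
```

To test accuracy, you could keep a few sample images with known text (sharp, blurry, handwritten, different fonts), run each through `safe_ocr`, and compare the result with the expected text using `similarity`. When `safe_ocr` returns `None`, the agent can ask the user to upload a clearer image.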
Please let me know if this helps.