
Image Input and Analysis - How does it work?

Hello, 
I am working in Microsoft Copilot Studio to build a custom copilot that serves as a guide for an application, and I want to enable image analysis. I have searched the documentation for this feature, but it doesn't explain how it works. To analyze images, do the knowledge sources need to contain similar images so that the copilot can associate an uploaded image with the information stored in a document? Do images inside a document help with the analysis, or does it work differently? I would appreciate it if someone could explain exactly how this works, as I didn't get much from the documentation. I am not sure whether I need to train my copilot again with documents that contain images so that it can better analyze the ones uploaded by users, or whether the AI alone can interpret the image and find the information to return based on the image's content.
  • Verified answer
    Michael E. Gernaey
    53,335 Super User 2025 Season 2
     
    You do not have to do that. You only need to make sure Generative AI is turned on to use image analysis out of the box (OOB).
     
    You can also turn on General knowledge, but essentially, no: you can just use the images as they are.
     
    If this helps you I'd appreciate if you Marked it as Resolved and maybe a like :-)
     
    Cheers
  • Suggested answer
    ronaldwalcott
    3,847 Super User 2025 Season 2
    Essentially, when it comes to using AI models, you first test the available models to determine whether they provide the expected results. If they don't work as you expect, then you either have to train your own models or find a model that works with your test cases.
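The "test first, then decide" approach described above can be sketched as a tiny evaluation harness. This is only an illustration under stated assumptions: `stub_model`, the file names, the labels, and the 0.9 threshold are all hypothetical placeholders, not part of Copilot Studio or AI Builder. In practice you would replace `stub_model` with whatever call reaches the model you are testing.

```python
from typing import Callable, Sequence, Tuple


def evaluate_model(
    predict: Callable[[str], str],
    cases: Sequence[Tuple[str, str]],
) -> float:
    """Return the fraction of test cases whose prediction matches the expected label."""
    if not cases:
        raise ValueError("need at least one test case")
    hits = sum(
        1
        for image_id, expected in cases
        if predict(image_id).strip().lower() == expected.strip().lower()
    )
    return hits / len(cases)


def stub_model(image_id: str) -> str:
    # Hypothetical stand-in for a real model call; returns canned answers.
    canned = {
        "login_screen.png": "Login page",
        "settings.png": "Settings page",
    }
    return canned.get(image_id, "unknown")


# Hypothetical test cases: (image file, label you expect the model to return).
TEST_CASES = [
    ("login_screen.png", "login page"),
    ("settings.png", "settings page"),
    ("report.png", "report page"),
]

accuracy = evaluate_model(stub_model, TEST_CASES)

# Assumed threshold: below it, consider a custom or different model.
needs_custom_model = accuracy < 0.9
print(f"accuracy={accuracy:.2f}, needs_custom_model={needs_custom_model}")
```

The point of keeping the test cases in a plain list is that the same cases can be rerun against each candidate model before deciding whether custom training is worth the effort.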
  • Verified answer
    SherriS
    19
    Hi there, 
     
    You didn't mention how you're adding image recognition to your copilot. The answer depends on which model, if any, you're adding to Copilot Studio to handle the computer vision task. There are several ways to do this, and the answer to your question differs for each one.
     
    Uploading Images to Copilot Studio (for example as a knowledge source):
    If you're using Copilot Studio's built-in image recognition, for example by uploading images as a knowledge source, you're right: I can't find in the documentation what kind of model is used for the image recognition task, either. But I suspect it's a Vision Transformer (ViT), or a similar model structure, attached to the front end of the Transformer model that they're using for the LLM. You can look that architecture up, but basically it's a way to convert images into vectors that the model can use, just as it uses text vectors to generate predictions. These models are trained on a giant corpus of images, just as LLMs are trained on a giant corpus of text. If you have a general use case (i.e., the kinds of things a model that had seen all the images on the internet would recognize), the model will be good at recognizing your images out of the box. There isn't a low-code / no-code way that I'm aware of to retrain or fine-tune these models. You can do it in a full-code way, but it would be very, very expensive: you're talking about amounts of data and compute time similar to what it takes to retrain or fine-tune an LLM.
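To make "converting images into vectors" a bit more concrete, here is a minimal sketch of the patch-embedding step that ViT-style models use: the image is cut into fixed-size patches, each patch is flattened, and a linear projection turns it into a vector (a "token"). This is not Copilot Studio's implementation, which is undocumented as noted above. The function names, the tiny 4x4 grayscale image, and the random weights are assumptions for illustration; a real model learns its weights and also adds position information.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def split_into_patches(image: Image, patch: int) -> List[List[float]]:
    """Cut the image into non-overlapping patch x patch squares, each flattened row by row."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = [
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ]
            patches.append(flat)
    return patches


def project(patches: List[List[float]], weights: List[List[float]]) -> List[List[float]]:
    """Linear projection: one dot product per output dimension, for every patch."""
    return [
        [sum(p * w for p, w in zip(patch, column)) for column in weights]
        for patch in patches
    ]


# Toy 4x4 image whose pixel value is its row-major index (0..15).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = split_into_patches(image, patch=2)  # 4 patches of 4 pixels each

# Random stand-in for learned weights: 3 output dimensions, 4 inputs each.
rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

tokens = project(patches, weights)  # 4 vectors of length 3
print(len(tokens), "tokens of dimension", len(tokens[0]))
```

The resulting list of vectors plays the same role for the Transformer that embedded text tokens do, which is why a single model can reason over text and images together.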
     
    Using Power Automate to Add a Custom Gen AI Prompt
    The same applies if you're using Power Automate to incorporate a Gen AI prompt. (You can find these in the AI Hub by clicking on prebuilt prompts, or you can create your own generic prompt and pull it into Power Automate.) If you use one of these and pull it into your model using Power Automate, all of the above applies. My guess is that it's probably also a ViT.
     
    Using Power Automate to Include an AI Builder Model
    You can also incorporate an AI Builder model into your Copilot Studio model using Gen AI. You would need to create one of the computer vision AI Builder models in the AI Hub and then pull it into Copilot Studio using one of the AI Builder actions in Power Automate. If you have very specific images that the Gen AI models aren't recognizing well (the vision equivalent of needing RAG for text), then you might benefit from adding the computer vision task this way, because you get access to fine-tuning a custom model to recognize your specific images. These are the good old Convolutional Neural Networks (CNNs), probably with some additional architecture that's proprietary to Microsoft.
     
    I hope this is helpful! If it is, I'd really appreciate if you would mark this answer as correct and give it a like!
