Answered

Custom AI Model vs putting OCR conditions

Posted by tareenmj

I have a flow in which a user submits a screenshot via Microsoft Flow. These screenshots contain statistics. The issue is that the screenshots can vary significantly, not only in the position of the relevant stats but also in the number of stats. Examples of the screenshots are as follows: (i) Samsung 1:

Samsung 1

(ii) Samsung 2:

Samsung 2

(iii) iPhone:

Dev7 - Service mode.jpg.png

You can see that these screenshots have different content and positions. These are just some of the images (there are other devices, such as LG). Currently, I'm running OCR, storing the result in an array, and then looping through the array to find the relevant information. This is awkward because the OCR array can vary dramatically from image to image, which makes it hard to pick out the relevant values reliably.
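For illustration only, the "loop through the OCR array" approach described above can be sketched in Python roughly as follows. This is not the actual flow (which runs in Power Automate); the label names (`RSRP`, `RSRQ`, `SINR`), the sample lines, and the helper `extract_stats` are all hypothetical, and the sketch assumes each stat appears on a single OCR line as a label followed by a number.

```python
import re

def extract_stats(ocr_lines, labels):
    """Return {label: value or None} by scanning OCR text lines for each label."""
    results = {label: None for label in labels}
    for line in ocr_lines:
        for label in labels:
            if results[label] is not None:
                continue  # keep the first match found for each label
            # Label, optional ':' or '=', then a signed integer or decimal.
            pattern = r"\b" + re.escape(label) + r"\b\s*[:=]?\s*(-?\d+(?:\.\d+)?)"
            match = re.search(pattern, line, flags=re.IGNORECASE)
            if match:
                results[label] = float(match.group(1))
    return results

# Hypothetical OCR output from two differently laid-out screenshots.
LABELS = ["RSRP", "RSRQ", "SINR"]
samsung_lines = ["Serving PLMN", "RSRP: -95 RSRQ: -11", "Band 4"]
iphone_lines = ["rsrp = -101.5", "sinr 12"]

print(extract_stats(samsung_lines, LABELS))
print(extract_stats(iphone_lines, LABELS))
```

Searching by label rather than by array position means a stat that is absent simply stays `None`, but the pattern still has to be adjusted for every new label format a device produces, which is the maintenance burden described above.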

Since I don't have a lot of experience with AI Builder: is this task better suited to AI Builder, or should I stick with my current process of running OCR and extracting the information myself?

Thanks for the help.

Categories:

General topics

All responses (5)

jinivthakkar 4,187

@tareenmj I have worked with AI in the past, and I can say for sure that using plain OCR is going to be super painful. I had a similar use case where the information was random every time, and I used Azure Custom Form Recognizer, which is super powerful. That was 1.5 years ago; back then AI Builder was not that good, but it has improved a lot since.

Azure Custom Form Recognizer has also become more powerful since then. We used Azure because it was much cheaper and more powerful.

You need to analyze your own use case and then make a decision. I am still inclined towards Azure because of its ease of use and integration.

There is also a dedicated forum for AI Builder, where you will get precise information because the AI team itself responds there:

https://powerusers.microsoft.com/t5/AI-Builder/bd-p/AIBuilder

Verified answer

JoeF-MSFT Microsoft Employee

Hi @tareenmj - it's great to see you pushing forward to automate this process. 🙂 And thanks @jinivthakkar for sharing your experience!

The good thing is that AI Builder Form Processing is built on top of Azure Form Recognizer, so it's the same great AI technology in both cases. In addition, AI Builder gives you an intuitive user experience and seamless integration with Power Automate and the rest of the Power Platform.

We can try the following for this use case:

Create a new AI Builder Form Processing model: Create a form processing custom model - AI Builder | Microsoft Docs

Define the different fields you want to extract from the screenshots.

Upload at least 5 sample screenshots of each type and group them by collection (1 collection for Samsung, 1 collection for iPhone...): https://docs.microsoft.com/en-us/ai-builder/create-form-processing-model#group-documents-by-collections

Tag the documents and train the model.

Once the model has been trained, you can test the model with new screenshots and see if it's able to extract the data. If you try this, I'd be super interested in hearing back how it works.

tareenmj 60

Thank you both, @jinivthakkar and @JoeF-MSFT, for your help. It is greatly appreciated! This is so helpful, and I will try training an AI Builder model with 3 different collections and checking whether it correctly detects the information. The problem I faced when using AI Builder in the past was that it seemed to extract information from the screenshot even if it wasn't there. For example, if the model was trained to detect 'variable A' and I uploaded a screenshot that didn't mention 'variable A' at all, it would grab something else from the screenshot, whereas the field should have been blank.

I was wondering whether there is an AI Builder model in Power Automate that can detect values from text. That is, could I run OCR on the image, store the result in a string, and then run the AI model on that string? Do you believe this would be a better approach for avoiding edge cases or unfamiliar cases?

Verified answer

JoeF-MSFT Microsoft Employee

Hi @tareenmj! For values that get extracted even though the field isn't actually in the screenshot, you can check the confidence score of the value. If the confidence score is low, you can discard the result.
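As a rough sketch of that idea (in a flow this would typically be a condition on each field's confidence score rather than code), the filtering logic looks something like the Python below. The dictionary shape, the field names, the helper `filter_by_confidence`, and the 0.8 threshold are all assumptions made for illustration; they are not the actual AI Builder output format, and a sensible threshold has to be found by testing on real screenshots.

```python
def filter_by_confidence(fields, threshold=0.8):
    """Blank out extracted values whose confidence falls below the threshold."""
    cleaned = {}
    for name, field in fields.items():
        # A missing confidence is treated as 0.0, so the value is discarded.
        confident = field.get("confidence", 0.0) >= threshold
        cleaned[name] = field.get("value") if confident else None
    return cleaned

# Hypothetical model output: 'variable_a' was not in the screenshot,
# so the model returned an unrelated string with a low score.
extracted = {
    "variable_a": {"value": "Band 4", "confidence": 0.31},
    "variable_b": {"value": "-95", "confidence": 0.97},
}

print(filter_by_confidence(extracted))
```

With this filter, a field that was not really present ends up blank instead of carrying whatever text the model happened to pick up.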

On detecting values from text, one possible approach could be to use Entity Extraction: Entity extraction custom AI model overview - AI Builder | Microsoft Docs

tareenmj 60

Very helpful. In your opinion, should I stick with image extraction using three different collections, or should I go with entity extraction? The only issue I have with entity extraction is that the text can look significantly different after OCR (i.e. screenshots from different devices produce differently structured text).
