Power Automate - Power Automate Desktop

Extract Text with OCR - Use another language to specify "trained" data

Posted on 27 Oct 2021 22:33:46 by cjibb02

Hi,

 

I'm using the "Extract text with OCR" action to read text from a grid in an application running under Citrix. After following the suggestions in this post - https://powerusers.microsoft.com/t5/Power-Automate-Desktop/Move-mouse-to-text-on-Screen-OCR-worked-yesterday-now-dont-work/m-p/1197401#M5294 - I'm getting OK-ish results, but they're still not good enough.

 

The text that I need to read isn't made up of real words in any language, and it contains lots of slashes, e.g. C560XL/XLS/IR. I then compare it to an input variable, so the OCR result needs to be exact. I was thinking that if I could train the engine on this kind of text, I would get better results. Power Automate Desktop has the "Use Other Language" option, which lets you set the language data path. I can't find any instructions on how this works, and I keep getting a "Failed to create the OCR engine" error.
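Because the comparison has to be exact, one possible mitigation is to normalise both the OCR output and the expected value before comparing them. The sketch below is plain Python, not a Power Automate Desktop feature; the function names and the table of confusable characters are assumptions for illustration and would need adjusting to the misreads actually seen on screen.

```python
# Hypothetical helpers: the names and the confusion table are illustrative assumptions.
# Characters that OCR engines often mix up are mapped to a single canonical form.
_CONFUSABLE = str.maketrans({"O": "0", "I": "1", "\\": "/"})


def canonical(text: str) -> str:
    """Remove all whitespace, upper-case the text, and fold confusable characters."""
    compact = "".join(text.split())
    return compact.upper().translate(_CONFUSABLE)


def ocr_matches(ocr_text: str, expected: str) -> bool:
    """Compare an OCR result with the expected code after normalising both sides."""
    return canonical(ocr_text) == canonical(expected)
```

With this table, `ocr_matches("C56OXL / XLS/IR", "C560XL/XLS/IR")` is true even though the OCR text has a letter O in place of the zero and stray spaces. The trade-off is that codes differing only in O/0 or I/1 can no longer be told apart, so this only suits data where such pairs do not occur.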

 

Can anyone give me more information on this feature? Will it do what I want it to do? How do I specify the data file?

 

I've tried downloading the language data from here - https://github.com/tesseract-ocr/langdata - but I'm clearly doing something wrong ...

 

[Screenshot: cjibb02_0-1635373445893.png]

 

Any help will be much appreciated!

  • cjibb02
    4 on 01 Nov 2021 at 13:14:55
    Re: Extract Text with OCR - Use another language to specify "trained" data

    Thanks. Yes, I've followed the steps in that link, but I always get "Failed to create OCR engine", even when I use the language files downloaded from GitHub - https://github.com/tesseract-ocr/langdata. I have a support ticket open with Microsoft for this and will post an update with whatever they come back with.

     

    I haven't tried the Microsoft Cognitive OCR action yet, but I will take some screenshots and run them through that service. There will be challenges in identifying and cropping the part of the screen I want to run OCR on, saving it as an image, and then parsing the results. So I'm not sure it will be practical, but it will be interesting to see how it compares to Tesseract run over the whole screen image.

  • fraenK
    2,125 on 29 Oct 2021 at 19:36:00
    Re: Extract Text with OCR - Use another language to specify "trained" data

    For the additional language, did you try this? https://docs.microsoft.com/en-us/power-automate/desktop-flows/how-to/ocr-multilingual-documents

    BUT the built-in OCR functionality based on Tesseract is not that great.

    Did you try the Microsoft cognitive action for OCR?

    https://docs.microsoft.com/en-us/power-automate/desktop-flows/actions-reference/microsoftcognitive#ocrmicrosoft

     

    Or would there be a chance to export the grid content as text from the application itself, to take a screenshot and run it through a more advanced 3rd-party OCR tool, or to install PAD within Citrix?

     

    Unfortunately, other RPA products are more advanced when it comes to Citrix-based automation.
