Power Automate
Unanswered

Extract text from PDF without external (non-Microsoft) connectors

Posted on by 8

Hi,

 

I need to extract the full text (no layout needed) from PDF files without using third-party connectors (Plumsail, Parser, et al.), as this is a GDPR and security issue (besides being insanely priced if you need to run the operation on a large number of files). I have temporarily solved it with my own PowerShell Azure Function that pipes the incoming PDF through a command-line tool and returns the text after stripping unwanted junk characters with -replace. However, this is a little less maintainable than solving it all in Flow.
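
For readers weighing this workaround, here is a minimal Python sketch of the same pipeline shape: hand the PDF to a command-line extractor, capture its output, and strip junk characters. The original function is PowerShell and the post does not name the tool, so `pdftotext` (from Poppler), the helper names `clean_text` and `extract_text`, and the cleanup rules are all assumptions for illustration.

```python
import re
import subprocess

# Assumption: Poppler's `pdftotext` is installed. The original post does not
# say which command-line tool was used. Passing "-" as the output file makes
# pdftotext write the extracted text to stdout.
DEFAULT_COMMAND = ("pdftotext", "-enc", "UTF-8")

# Control characters other than tab, newline and carriage return.
_JUNK = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(raw: str) -> str:
    """Strip junk characters and tidy whitespace (illustrative rules only)."""
    text = _JUNK.sub("", raw)
    text = re.sub(r"[ \t]+", " ", text)     # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)    # trim spaces around newlines
    text = re.sub(r"\n{3,}", "\n\n", text)  # keep at most one blank line
    return text.strip()


def extract_text(pdf_path: str, command=DEFAULT_COMMAND) -> str:
    """Run the extractor on pdf_path and return the cleaned text."""
    result = subprocess.run(
        [*command, pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if the tool fails
    )
    return clean_text(result.stdout.decode("utf-8", errors="replace"))
```

Keeping the cleanup in its own function means the junk-character rules can be tested and adjusted without a PDF or the external tool at hand, which addresses part of the maintainability concern.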


Any creative suggestions on how to achieve it?

 

Cheers!

  • v-zhos-msft
    Re: Extract text from PDF without external (non-Microsoft) connectors

    Hi @MS2 ,

    I am afraid there is currently no way to do this in Microsoft Flow.

    There is an existing idea similar to your request that you can vote for here:

    https://powerusers.microsoft.com/t5/Flow-Ideas/Convert-PDF-to-Text-Table-Image/idi-p/176806?advanced=false&collapse_discussion=true&filter=location&location=category:FL_Comm_Ideas&q=convert%20pdf%20to%20text&search_type=thread

    Best Regards,

    Community Support Team _ Zhongys

    If this post helps, please consider accepting it as the solution to help other members find it more quickly.

  • MS2
    8
    Re: Extract text from PDF without external (non-Microsoft) connectors

    Since you can search for text in PDFs in both OneDrive and SharePoint, some part of those backends must be able to read the text in PDFs. Is there no way to use that through REST or some other API to "get the text out"?

  • v-zhos-msft
    Re: Extract text from PDF without external (non-Microsoft) connectors

    Hi @MS2 ,

    I ran some tests with OneDrive and SharePoint.

    I got the file content from a PDF, then created a .txt file with that content.

    However, the resulting .txt file contains only garbled characters.

    If you want to achieve this, I am afraid you need to use the Plumsail connector.

    Best Regards,

    Community Support Team _ Zhongys

    If this post helps, please consider accepting it as the solution to help other members find it more quickly.

     

  • Community Power Platform Member
    Re: Extract text from PDF without external (non-Microsoft) connectors

    You can extract text or items from a PDF using the PowerApps AI Builder feature.  

    https://powerapps.microsoft.com/en-ca/ai-builder/

     

     

  • takolota1
    4,974 Moderator
    Re: Extract text from PDF without external (non-Microsoft) connectors

    There is a part of this template that converts an image PDF to a text (txt) replica of the file:

    https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Extract-Data-From-PDFs-and-Images-With-GPT/td-p/2201345
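
On the garbled .txt result reported earlier in the thread: a PDF is not a text file. Page text usually sits inside compressed content streams (commonly Flate/zlib), so writing the raw file content to a .txt file yields binary noise. The Python sketch below imitates the effect with a made-up content stream. It illustrates the cause only and is not a PDF parser; the sample stream and variable names are assumptions.

```python
import zlib

# A made-up page content stream: PDF text operators wrapping a few strings.
content_stream = (
    b"BT /F1 12 Tf 72 720 Td (Invoice total: 100 EUR) Tj "
    b"0 -14 Td (Invoice date: 2019-05-01) Tj "
    b"0 -14 Td (Invoice number: 42) Tj ET"
)

# PDF writers commonly store such streams Flate-compressed (zlib/deflate).
stored_bytes = zlib.compress(content_stream)

# Saving the file content as .txt effectively treats these bytes as text.
as_text = stored_bytes.decode("latin-1")

# The readable string is no longer present in the stored form.
print("Invoice" in as_text)

# It only comes back once the stream is decoded.
print(b"Invoice" in zlib.decompress(stored_bytes))
```

This is why a PDF-aware extractor is needed: it has to decode the streams and map font encodings back to characters before any text can be returned.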
