Power Platform Community Forum Thread Details

I have a flow in Power Automate Cloud flow that runs when a new item is created in a SharePoint library. The flow works and triggers correctly.

My goal is to summarize the content of the uploaded file using the ChatGPT API. However, the file content I receive from SharePoint is in binary format, and the ChatGPT API does not accept binary input.

Is there a way to convert the binary file content (such as PDF or DOCX) into plain text within Power Automate so that it can be sent to the ChatGPT API?

I would prefer a solution that does not require any premium connectors. Has anyone implemented something similar or found a workaround using standard Power Automate actions?

Any guidance or examples would be appreciated.

Categories:

Building flows

Hi,

In Power Automate, when you use Get file content from SharePoint, the file is returned in binary format. Since APIs like the ChatGPT API expect text, the file content needs to be converted to readable text before sending it in the request.

The exact approach usually depends on the file type.

For PDF files, Power Automate does not currently have a standard action that directly extracts text from a PDF without using AI Builder or another premium connector. Because of that, people usually use AI Builder, a third-party connector, or an external service to extract the text.

For DOCX files, it’s a little different. A DOCX file is technically a ZIP package containing XML files, and the actual document text is inside a file called document.xml. In theory the text can be extracted from there, but doing that purely inside Power Automate with standard actions can be difficult.

If you want to stay with standard connectors, one approach is something like this:

Trigger – When a file is created in SharePoint
Get file content
Process or convert the file depending on whether it is PDF or DOCX
Send the extracted text to the ChatGPT API using an HTTP action

Another option, if you control the document format, is to save files as plain text or HTML first. Those formats are much easier to read and send directly to an API.

If premium connectors are available, the process becomes easier. You could use AI Builder – Extract text from documents, or connectors like Encodian or Adobe PDF Services to extract text from PDF or DOCX files. These actions return the text output, which can then be passed directly to the ChatGPT API.

In short, Get file content will give you the file, but converting PDF or DOCX to plain text usually requires AI Builder, a premium connector, or an external service to extract the text before sending it to ChatGPT.

Hope this helps. If this helps resolve your issue, please consider marking the response as Verified so it may help others facing a similar scenario.