Hi,
In Power Automate, when you use Get file content from SharePoint, the file is returned in binary format. Since APIs like the ChatGPT API expect text, the file content needs to be converted to readable text before sending it in the request.
The exact approach usually depends on the file type.
For PDF files, Power Automate does not currently have a standard action that directly extracts text from a PDF without using AI Builder or another premium connector. Because of that, people usually use AI Builder, a third-party connector, or an external service to extract the text.
For DOCX files, it’s a little different. A DOCX file is technically a ZIP package containing XML files, and the actual document text is inside a file called document.xml. In theory the text can be extracted from there, but doing that purely inside Power Automate with standard actions can be difficult.
If you want to stay with standard connectors, one approach is something like this:
-
Trigger – When a file is created in SharePoint
-
Get file content
-
Process or convert the file depending on whether it is PDF or DOCX
-
Send the extracted text to the ChatGPT API using an HTTP action
Another option, if you control the document format, is to save files as plain text or HTML first. Those formats are much easier to read and send directly to an API.
If premium connectors are available, the process becomes easier. You could use AI Builder – Extract text from documents, or connectors like Encodian or Adobe PDF Services to extract text from PDF or DOCX files. These actions return the text output, which can then be passed directly to the ChatGPT API.
In short, Get file content will give you the file, but converting PDF or DOCX to plain text usually requires AI Builder, a premium connector, or an external service to extract the text before sending it to ChatGPT.
Hope this helps. If this helps resolve your issue, please consider marking the response as Verified so it may help others facing a similar scenario.