Power Automate
Answered

Concating only Text results from Recognize text in an image or a PDF document Action

Posted by Gematria

Hi.

 

We have a SharePoint site with 900+ different contracts scanned to PDF files.

We want to use an OCR Action to extract the text to make the contract data searchable.

 

We have set up a Cloud flow to extract the File Content and perform the OCR Action, but I am struggling a bit with the output from the Action.

 

I want to concatenate all the text from each contract into one string that I can then write to a text file WITHOUT using a loop to append to a variable, but I am not able to find a way to properly reference the text property of the returned JSON data.

 

The JSON schema looks like this:

{
  "type": "object",
  "properties": {
    "body": {
      "type": "object",
      "properties": {
        "@@odata.context": {
          "type": "string"
        },
        "responsev2": {
          "type": "object",
          "properties": {
            "@@odata.type": {
              "type": "string"
            },
            "operationStatus": {
              "type": "string"
            },
            "predictionId": {
              "type": "string"
            },
            "predictionOutput": {
              "type": "object",
              "properties": {
                "@@odata.type": {
                  "type": "string"
                },
                "results@odata.type": {
                  "type": "string"
                },
                "results": {
                  "type": "array",
                  "items": {
                    "type": "object",
                    "properties": {
                      "@@odata.type": {
                        "type": "string"
                      },
                      "page": {
                        "type": "integer"
                      },
                      "lines@odata.type": {
                        "type": "string"
                      },
                      "lines": {
                        "type": "array",
                        "items": {
                          "type": "object",
                          "properties": {
                            "@@odata.type": {
                              "type": "string"
                            },
                            "text": {
                              "type": "string"
                            },
                            "boundingBox": {
                              "type": "object",
                              "properties": {
                                "@@odata.type": {
                                  "type": "string"
                                },
                                "left": {
                                  "type": "number"
                                },
                                "top": {
                                  "type": "number"
                                },
                                "width": {
                                  "type": "number"
                                },
                                "height": {
                                  "type": "number"
                                },
                                "polygon": {
                                  "type": "object",
                                  "properties": {
                                    "@@odata.type": {
                                      "type": "string"
                                    },
                                    "coordinates@odata.type": {
                                      "type": "string"
                                    },
                                    "coordinates": {
                                      "type": "array",
                                      "items": {
                                        "type": "object",
                                        "properties": {
                                          "@@odata.type": {
                                            "type": "string"
                                          },
                                          "x": {
                                            "type": "number"
                                          },
                                          "y": {
                                            "type": "number"
                                          }
                                        },
                                        "required": [
                                          "@@odata.type",
                                          "x",
                                          "y"
                                        ]
                                      }
                                    }
                                  }
                                }
                              }
                            }
                          },
                          "required": [
                            "@@odata.type",
                            "text",
                            "boundingBox"
                          ]
                        }
                      }
                    },
                    "required": [
                      "@@odata.type",
                      "page",
                      "lines@odata.type",
                      "lines"
                    ]
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
I have tried referencing the data in various ways, but each time I get the error: "Array elements can only be selected using an integer index".
 
I am able to reference the lines property, which should contain the text, but after that I come up short when trying to concatenate the text into a single Compose action.
 
I think we are getting issues because there are several pages in the PDF files instead of 1 page.
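
The error message is consistent with the shape of the schema above: `results` is an array with one entry per page, and each entry holds its own `lines` array, so a path that goes straight from `results` to `lines` has no index with which to pick a page. A minimal Python sketch of the same situation follows. It is an analogy, not Power Automate code, and the sample payload and helper names are made up for illustration:

```python
# Hypothetical sample shaped like the schema above: one "results" entry per page.
prediction_output = {
    "results": [
        {"page": 1, "lines": [{"text": "SomeText"}, {"text": "SomeText"}]},
        {"page": 2, "lines": [{"text": "SomeText"}]},
    ]
}


def first_line_text(output, page_index):
    """Select a page by integer index, then read the first line's text."""
    return output["results"][page_index]["lines"][0]["text"]


def lines_without_index(output):
    """Mimic a path such as results/lines that skips the page index."""
    # "results" is a list, so selecting by name raises TypeError here.
    return output["results"]["lines"]
```

Selecting a single page by index works, but reading every page this way would need one lookup per page, which is what a loop does.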
 
Any help would be much appreciated!
  • Gematria

    These may be PDFs, but the majority are actually pictures in the PDF (Scanned documents basically).

    While the Encodian actions would probably work anyway, we are not interested in using them because they would require additional licenses. We would rather use the licenses we already have for the AI actions.

  • Matthy79 (Super User 2024 Season 1)

    Hello @Gematria 

     

    You can try to use xpath to extract the data you need.

     

    If you provide a sample JSON and desired output, I will have a look at it.

  • Gematria

    Hi.

    Sorry for slow reply, been a busy week.

     

    Please see attached JSON Example. I have replaced all text with "SomeText" since this is company sensitive data 🙂

  • Verified answer
    Matthy79 (Super User 2024 Season 1)

    Sorry, I was confused by the JSON schema. That information was not necessary, and because of it I thought you were using a different connector. Since you are using the standard connector, this should work:

     

    join(xpath(xml(json(concat('{ "root": { "results": ', outputs('Recognize_text_in_an_image_or_a_PDF_document')?['body/responsev2/predictionOutput/results'], ' } }'))), 'root/results/lines/text/text()'), ' ')

     

    If you didn't rename the OCR action, the expression should work as is. Otherwise, change the action name inside outputs('...') to match the name of your action.
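
For readers who want to see what the expression computes: it wraps the `results` array in a `root` object, converts that JSON to XML, uses XPath to select the text of every `text` element under every `lines` element on every page, and joins the selected values with a space. A rough Python equivalent of that flatten-and-join step is sketched below. It is an illustration only, not something that runs inside a flow, and the function name and sample data are assumptions:

```python
def join_ocr_text(results, separator=" "):
    """Collect each line's text across all pages, in order, and join it."""
    texts = [
        line["text"]
        for page in results                # one entry per page
        for line in page.get("lines", [])  # one entry per recognized line
    ]
    return separator.join(texts)


# Made-up sample in the shape of predictionOutput/results.
sample_results = [
    {"page": 1, "lines": [{"text": "First line"}, {"text": "second line"}]},
    {"page": 2, "lines": [{"text": "third line"}]},
]

print(join_ocr_text(sample_results))  # prints: First line second line third line
```

Changing the separator (the final ' ' argument of join in the flow expression) to a line break would keep each recognized line on its own line in the output file.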
