1040 Model Not Returning valueNumber (2015, 2017, 2023)

Posted by DW-21032103-0

Hello,

I’m working on a Power Automate flow using Azure Document Intelligence (prebuilt US tax / 1040 model) to extract data from IRS Form 1040 PDFs into SharePoint.

I’ve spent 30+ hours troubleshooting this across multiple tax years (2015, 2017, 2023) and am seeing consistent issues with missing numeric values.

Flow Setup:

1. Upload IRS Form 1040 PDF (clean, digital, not scanned)
2. Analyze Document (Azure Document Intelligence)
3. Parse JSON:

json(string(outputs('Analyze_Document_for_Prebuilt_or_Custom_models_(v4.x_API)')?['body']))

4. Create item in SharePoint using expressions such as:

coalesce(
  body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['BoxXX']?['valueNumber'],
  0
)

Issue:

Many expected numeric fields return 0
In Parse JSON:
- Fields are present (e.g., Box7, Box11, Box24, etc.)
- BUT valueNumber is missing
- Only confidence is returned

Example:

"fieldName": "Box11",
"fieldValue": {
  "type": "number",
  "confidence": 0.98
}

Important Notes:

- This occurs across multiple tax years (2015, 2017, 2023)
- PDFs are high-quality, digitally generated (not scans)
- SharePoint columns verified and working
- Expressions work correctly when valueNumber is present
- Some fields (e.g., wages) occasionally extract correctly, but most do not

Request:

Since I cannot share actual tax return files due to sensitive information:

- Is this expected behavior for the 1040 model?
- Are only certain fields guaranteed to return valueNumber?
- Is there a known limitation with specific tax years or form layouts?
- What is the recommended approach when fields are detected but values are not extracted?

Goal:

Reliable extraction of structured financial values from IRS 1040 forms into Power Automate.

Thank you for any guidance.

Categories:

Building flows

All responses (2)

Suggested answer

Haque 3,653

Hi @DW-21032103-0,

First, according to Microsoft’s documentation, the tax models use OCR plus layout analysis to extract structured fields, but not every field is guaranteed to produce a numeric value, especially across different tax years and IRS form revisions.

Second, a field that comes back with a confidence but no valueNumber usually means it was recognized structurally (the box was detected on the form), but the OCR/field extraction did not capture a usable value.

Third (related to the first point), if the IRS form is scanned or flattened, numeric fields may not be extracted properly.

Fourth, after running Analyze Document (Azure Document Intelligence), it is always good to check what it actually analyzed. Running OCR preprocessing (e.g., Azure Cognitive Services OCR) before feeding the file into Document Intelligence can improve results. Alternatively, validating the 1040 model output against a schema (see the final suggestion below) at least acts as a safeguard that something was returned.

Fifth, before your third step (Parse JSON), it helps to look at the analysis that Azure Document Intelligence produced. You can view it either through the Document Intelligence Studio interface or programmatically via the API/SDKs, where it is returned as a structured JSON object. For batch operations, the results are stored in an Azure Blob Storage container.
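If you save the raw JSON from the Analyze Document step, a quick offline check can show which fields came back without a value. The Python sketch below is only an illustration: it assumes the output has the analyzeResult -> documents[0] -> fields shape used in the expressions in this thread, and the sample values are made up.

```python
import json

# Hypothetical sample shaped like the paths used in this thread
# (analyzeResult -> documents[0] -> fields); the values are invented.
SAMPLE = json.loads("""
{"analyzeResult": {"documents": [{"fields": {
  "Box1":  {"type": "number", "valueNumber": 52000.0, "content": "52,000", "confidence": 0.99},
  "Box11": {"type": "number", "confidence": 0.98},
  "Box24": {"type": "number", "content": "4,310", "confidence": 0.95}
}}]}}
""")


def fields_missing_value(result):
    """Map each number field that lacks valueNumber to its raw content (or None)."""
    docs = result.get("analyzeResult", {}).get("documents", [])
    if not docs:
        return {}
    fields = docs[0].get("fields", {})
    return {
        name: field.get("content")
        for name, field in fields.items()
        if field.get("type") == "number" and "valueNumber" not in field
    }


missing = fields_missing_value(SAMPLE)
for name, content in missing.items():
    print(f"{name}: no valueNumber, content={content!r}")
```

A field listed with content=None has nothing to fall back on, while a field with a content string can still be recovered by parsing that string.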

To start, let's take two steps:

Step 1: Fall back to content or valueString, so that numbers are captured even when they are returned as strings.

coalesce(
  body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['Box11']?['valueNumber'],
  body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['Box11']?['content'],
  0
)
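One thing to keep in mind with this fallback: content is a string, so when valueNumber is missing the expression returns text such as "4,310" rather than a number, and a numeric SharePoint column may need it converted first. The Python sketch below shows the same fallback logic with a conversion step. The helper name to_number and the cleaning rules (dropping currency symbols and commas, treating parentheses as a negative amount) are my own assumptions, not part of the connector.

```python
import re


def to_number(field, default=0.0):
    """Prefer valueNumber; otherwise try to parse the raw content string."""
    if field is None:
        return default
    value = field.get("valueNumber")
    if isinstance(value, (int, float)):
        return float(value)
    text = (field.get("content") or "").strip()
    # Assumption: amounts written as (1,200) are accounting-style negatives.
    negative = text.startswith("(") and text.endswith(")")
    # Drop currency symbols, thousands separators, spaces and parentheses.
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    try:
        number = float(cleaned)
    except ValueError:
        return default  # nothing usable was extracted
    return -number if negative else number


print(to_number({"type": "number", "valueNumber": 52000}))
print(to_number({"type": "number", "content": "$ 4,310"}))
print(to_number({"type": "number", "confidence": 0.98}))
```

Returning the default only when both valueNumber and content are unusable keeps a real zero distinguishable in the audit data, as long as the raw JSON is stored alongside.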

Step 2: Store the JSON for auditing. It is always better to keep the full JSON output in a SharePoint column (e.g., “RawJSONExtraction”), so that we can troubleshoot later when values do not appear.

Finally, I am not sure whether you have already applied a JSON schema for the 1040 model in the Parse JSON action. If not, I suggest doing so: paste the schema below into the Parse JSON action (you can adjust it for the fields you need).

{
  "type": "object",
  "properties": {
    "analyzeResult": {
      "type": "object",
      "properties": {
        "documents": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "fields": {
                "type": "object",
                "properties": {
                  "TaxpayerName": {
                    "type": "object",
                    "properties": {
                      "type": { "type": "string" },
                      "content": { "type": "string" },
                      "confidence": { "type": "number" }
                    }
                  },
                  "SSN": {
                    "type": "object",
                    "properties": {
                      "type": { "type": "string" },
                      "content": { "type": "string" },
                      "confidence": { "type": "number" }
                    }
                  },
                  "Wages": {
                    "type": "object",
                    "properties": {
                      "type": { "type": "string" },
                      "valueNumber": { "type": "number" },
                      "content": { "type": "string" },
                      "confidence": { "type": "number" }
                    }
                  },
                  "AGI": {
                    "type": "object",
                    "properties": {
                      "type": { "type": "string" },
                      "valueNumber": { "type": "number" },
                      "content": { "type": "string" },
                      "confidence": { "type": "number" }
                    }
                  },
                  "RefundAmount": {
                    "type": "object",
                    "properties": {
                      "type": { "type": "string" },
                      "valueNumber": { "type": "number" },
                      "content": { "type": "string" },
                      "confidence": { "type": "number" }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Feed the output of the Analyze Document (1040 model) into Parse JSON. We will now be able to reference fields directly, e.g.:

body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['Wages']?['valueNumber']

body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['TaxpayerName']?['content']

The schema covers common fields such as TaxpayerName, SSN, Wages, AGI, and RefundAmount. Adjust the property names to match the field names in your actual output (for example, Box11), and expand the schema as needed depending on which boxes you want to capture.
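If you need many boxes, writing the schema by hand gets tedious. The Python sketch below generates a schema of the same shape for a list of field names. The helper names field_schema and build_schema are hypothetical, and Box7, Box11, and Box24 are simply the names mentioned in the question. Note that nothing is listed under "required", so this schema describes the fields but does not force valueNumber to be present.

```python
import json


def field_schema(numeric):
    """Schema for one extracted field; numeric fields also allow valueNumber."""
    props = {
        "type": {"type": "string"},
        "content": {"type": "string"},
        "confidence": {"type": "number"},
    }
    if numeric:
        props["valueNumber"] = {"type": "number"}
    return {"type": "object", "properties": props}


def build_schema(numeric_fields, text_fields=()):
    """Wrap the per-field schemas in the analyzeResult/documents/fields nesting."""
    fields = {name: field_schema(True) for name in numeric_fields}
    fields.update({name: field_schema(False) for name in text_fields})
    document = {
        "type": "object",
        "properties": {"fields": {"type": "object", "properties": fields}},
    }
    analyze_result = {
        "type": "object",
        "properties": {"documents": {"type": "array", "items": document}},
    }
    return {"type": "object", "properties": {"analyzeResult": analyze_result}}


schema = build_schema(["Box7", "Box11", "Box24"])
print(json.dumps(schema, indent=2))
```

The printed JSON can be pasted into the Parse JSON action in place of the hand-written schema.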

I hope these pointers help. If they resolve the issue that brought you here, please don't forget to check the box Does this answer your question? A like on the response would also be appreciated.

Haque 3,653

Hi @DW-21032103-0,

I'm just following up: were you able to resolve the issue?
