Power Platform Community Forum Thread Details

Hello,

I’m working on a Power Automate flow using Azure Document Intelligence (prebuilt US tax / 1040 model) to extract data from IRS Form 1040 PDFs into SharePoint.

I’ve spent 30+ hours troubleshooting this across multiple tax years (2015, 2017, 2023) and am seeing consistent issues with missing numeric values.

Flow Setup:

1. Upload IRS Form 1040 PDF (clean, digitally generated, not scanned)
2. Analyze Document (Azure Document Intelligence)
3. Parse JSON:

json(string(outputs('Analyze_Document_for_Prebuilt_or_Custom_models_(v4.x_API)')?['body']))

4. Create item in SharePoint using expressions such as:

coalesce(
  body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['BoxXX']?['valueNumber'],
  0
)
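
One side effect of the coalesce pattern above is that a fallback 0 looks the same as a genuine 0 on the form. As a rough offline sketch in Python (not a Power Automate expression), assuming the analyze result has been saved as JSON and that fields is keyed by field name as in the expression above, a helper could return None when nothing was extracted. The helper name read_number is hypothetical, and the fallback to a raw content string is an assumption that should be checked against the actual output.

```python
import re
from typing import Any, Optional


def read_number(fields: dict[str, Any], name: str) -> Optional[float]:
    """Return the numeric value of a field, or None if nothing was extracted."""
    field = fields.get(name) or {}
    value = field.get("valueNumber")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    # Assumption: a raw text 'content' may exist even when valueNumber does not.
    content = field.get("content")
    if isinstance(content, str):
        # Drop currency symbols, thousands separators, and whitespace.
        cleaned = re.sub(r"[,$\s]", "", content)
        # Treat accounting-style parentheses as a negative amount.
        if cleaned.startswith("(") and cleaned.endswith(")"):
            cleaned = "-" + cleaned[1:-1]
        try:
            return float(cleaned)
        except ValueError:
            return None
    return None
```

Returning None rather than 0 would let the flow leave the SharePoint column empty or flag the item for review instead of storing a misleading zero.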

Issue:

Many expected numeric fields come back as 0, which is the coalesce fallback rather than an extracted value.
In the Parse JSON output:
- The fields are present (e.g., Box7, Box11, Box24)
- But valueNumber is missing
- Only type and confidence are returned

Example:

"fieldName": "Box11",
"fieldValue": {
  "type": "number",
  "confidence": 0.98
}
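
To measure how widespread the pattern in this example is, the saved analyze result could be audited offline. The following is a minimal Python sketch, assuming the analyzeResult object has the documents[0].fields layout used in the SharePoint expression, with fields keyed by name. The function name fields_missing_value is hypothetical.

```python
from typing import Any


def fields_missing_value(analyze_result: dict[str, Any]) -> list[str]:
    """List number-typed fields in the first document that lack valueNumber."""
    documents = analyze_result.get("documents") or []
    if not documents:
        return []
    fields = documents[0].get("fields") or {}
    return sorted(
        name
        for name, field in fields.items()
        if isinstance(field, dict)
        and field.get("type") == "number"
        and "valueNumber" not in field
    )
```

Running this against results from each tax year would show whether the same boxes are affected every time or whether the gaps vary by form layout.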

Important Notes:

- This occurs across multiple tax years (2015, 2017, 2023)
- PDFs are high-quality, digitally generated (not scans)
- SharePoint columns verified and working
- Expressions work correctly when valueNumber is present
- Some fields (e.g., wages) occasionally extract correctly, but most do not

Request:

I cannot share the actual tax return files because they contain sensitive information, so I would appreciate answers to the following:

- Is this expected behavior for the 1040 model?
- Are only certain fields guaranteed to return valueNumber?
- Is there a known limitation with specific tax years or form layouts?
- What is the recommended approach when fields are detected but values are not extracted?
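
Because the real returns cannot be shared, one option is to share only the structure of the output. This is a hedged Python sketch, assuming fields is keyed by field name as in the expression earlier in this post. It keeps type and confidence for each field and records which other keys were present without their values. The names redact_fields, SAFE_KEYS, and otherKeys are hypothetical, and the result should still be reviewed by hand before posting.

```python
from typing import Any

# Keys assumed safe to share: they describe structure, not taxpayer data.
SAFE_KEYS = {"type", "confidence"}


def redact_fields(fields: dict[str, Any]) -> dict[str, Any]:
    """Keep only type/confidence per field, plus the names of other keys present."""
    redacted = {}
    for name, field in fields.items():
        if not isinstance(field, dict):
            continue
        kept = {k: v for k, v in field.items() if k in SAFE_KEYS}
        # Record key names only, never their values.
        kept["otherKeys"] = sorted(k for k in field if k not in SAFE_KEYS)
        redacted[name] = kept
    return redacted
```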

Goal:

Reliable extraction of structured financial values from IRS 1040 forms into Power Automate.

Thank you for any guidance.

Categories:

Building flows