Hello,
I’m working on a Power Automate flow using Azure Document Intelligence (prebuilt US tax / 1040 model) to extract data from IRS Form 1040 PDFs into SharePoint.
I’ve spent 30+ hours troubleshooting this across multiple tax years (2015, 2017, 2023) and am seeing consistent issues with missing numeric values.
Flow Setup:
- Upload IRS Form 1040 PDF (clean, digital, not scanned)
- Analyze Document (Azure Document Intelligence)
- Parse JSON:
  json(string(outputs('Analyze_Document_for_Prebuilt_or_Custom_models_(v4.x_API)')?['body']))
- Create item in SharePoint using expressions such as:
  coalesce(
    body('Parse_JSON')?['analyzeResult']?['documents']?[0]?['fields']?['BoxXX']?['valueNumber'],
    0
  )
Issue:
- Many expected numeric fields return 0
- In Parse JSON:
  - Fields are present (e.g., Box7, Box11, Box24, etc.)
  - BUT valueNumber is missing
  - Only confidence is returned
- Example:
  "fieldName": "Box11",
  "fieldValue": {
    "type": "number",
    "confidence": 0.98
  }
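To make the pattern easier to reproduce without sharing real returns, here is a minimal Python sketch of an offline check on saved analyze output. It assumes the JSON shape implied by the SharePoint expression above (analyzeResult > documents > fields, with each field holding type, confidence, and optionally valueNumber and content). The function name and the sample data are hypothetical, and the real 1040 output may nest fields differently.

```python
from typing import Any


def fields_missing_value_number(result: dict[str, Any]) -> list[dict[str, Any]]:
    """List number-typed fields that have no valueNumber key.

    Assumes a flat fields dictionary per document; nested objects
    are not traversed in this sketch.
    """
    report = []
    documents = result.get("analyzeResult", {}).get("documents", [])
    for doc_index, doc in enumerate(documents):
        for name, field in (doc.get("fields") or {}).items():
            if not isinstance(field, dict):
                continue
            if field.get("type") == "number" and "valueNumber" not in field:
                report.append({
                    "document": doc_index,
                    "field": name,
                    "confidence": field.get("confidence"),
                    # Raw text, if the service returned any for this field.
                    "content": field.get("content"),
                })
    return report


# Made-up sample mirroring the symptom: Box11 is detected but has no value.
sample = {
    "analyzeResult": {
        "documents": [
            {
                "fields": {
                    "Box7": {"type": "number", "valueNumber": 52000.0,
                             "content": "52,000", "confidence": 0.99},
                    "Box11": {"type": "number", "confidence": 0.98},
                }
            }
        ]
    }
}

missing = fields_missing_value_number(sample)
for row in missing:
    print(row)
```

Running this against the body of the Analyze Document action (saved to a file and loaded with json.load) would show whether the affected fields carry any raw content at all, or only a confidence score.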
Important Notes:
- This occurs across multiple tax years (2015, 2017, 2023)
- PDFs are high-quality, digitally generated (not scans)
- SharePoint columns verified and working
- Expressions work correctly when valueNumber is present
- Some fields (e.g., wages) occasionally extract correctly, but most do not
Request:
Since I cannot share actual tax return files due to sensitive information:
- Is this expected behavior for the 1040 model?
- Are only certain fields guaranteed to return valueNumber?
- Is there a known limitation with specific tax years or form layouts?
- What is the recommended approach when fields are detected but values are not extracted?
Goal:
Reliable extraction of structured financial values from IRS 1040 forms into Power Automate.
Thank you for any guidance.
