
Issue with PDF Invoice Classification Logic in Power Automate Flow


Hi all,

I’m building a Power Automate flow to process email attachments and store only valid invoice PDFs into SharePoint.

Current design:

  • Trigger: New email with attachments
  • Loop: Apply to each attachment
  • PDF check
  • Classification using email body text (Html-to-text)
    • Positive: invoice, tax invoice, invoice no, inv
    • Negative: quotation, proforma, credit note, delivery order, etc.
  • SharePoint duplicate check (Get items)
  • Store valid invoices, others routed to “Others” folder
Issue:
  • Classification is inconsistent in real scenarios:
    • Proforma/quotation emails sometimes pass invoice filters due to email text noise (forwarded/replied emails).
    • No validation at PDF/document level (only email body is used).
    • Multiple attachments and mixed email content reduce accuracy.
Question:
  • What is the best practice for reliable invoice vs non-invoice PDF classification in Power Automate?
  • Should this be:
    • email text-based filtering
    • document-level OCR / AI Builder extraction
    • or a hybrid approach?
  • Also, how do you usually separate classification, validation, and SharePoint storage for better reliability and maintainability?
  • Suggested answer
    11manish
    For reliable invoice processing in Power Automate, avoid relying solely on email body text because forwarded emails, replies, signatures, and mixed attachments can cause inaccurate classification.
     
    A hybrid approach is recommended:
    1. Validate the attachment (PDF, file size, etc.).
    2. Extract and analyze the PDF content using:
      • AI Builder Invoice Processing
      • AI Builder Document Processing
      • Azure AI Document Intelligence
    3. Classify the document based on invoice-specific fields such as:
      • Invoice Number
      • Supplier Name
      • Invoice Date
      • Total Amount
    4. Validate the extracted data and confidence score.
    5. Check for duplicates using business keys (e.g., Supplier + Invoice Number) rather than filenames.
    6. Store the document in the appropriate SharePoint folder (Processed, Duplicate, Validation Failed, Others, etc.).
    Recommended Flow:
     
    Email Received -> Attachment Validation -> OCR / AI Builder Extraction -> Invoice Classification -> Business Validation -> Duplicate Check -> SharePoint Storage
     
    Best Practice: Use email content only as a secondary indicator. For production solutions, AI Builder Invoice Processing combined with SharePoint metadata validation and duplicate detection provides the highest accuracy and maintainability.
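
As a rough illustration of steps 4 to 6 above, the routing decision can be sketched outside Power Automate in Python. This is a minimal sketch, not the AI Builder API: the `Extraction` shape, the field names, and the 0.80 confidence threshold are assumptions made for the example, and `seen_keys` stands in for whatever a SharePoint lookup would return.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed field names and threshold; tune both against real documents.
REQUIRED_FIELDS = ("invoice_number", "supplier_name", "invoice_date", "total_amount")
MIN_CONFIDENCE = 0.80


@dataclass
class Extraction:
    """Result of an extraction step (hypothetical shape)."""
    fields: dict        # e.g. {"invoice_number": "INV-001", ...}
    confidence: float   # overall confidence in the range 0..1


def business_key(supplier: str, invoice_number: str) -> str:
    """Build a duplicate-check key from supplier + invoice number."""
    def norm(value: str) -> str:
        # Ignore case, spaces, and punctuation so "inv-001" matches "INV 001".
        return "".join(ch for ch in value.upper() if ch.isalnum())
    return f"{norm(supplier)}|{norm(invoice_number)}"


def route(extraction: Optional[Extraction], seen_keys: set) -> str:
    """Return the target folder name for one document."""
    if extraction is None:
        return "Others"  # nothing invoice-like was extracted
    missing = [f for f in REQUIRED_FIELDS if not extraction.fields.get(f)]
    if missing or extraction.confidence < MIN_CONFIDENCE:
        return "Validation Failed"
    key = business_key(extraction.fields["supplier_name"],
                       extraction.fields["invoice_number"])
    if key in seen_keys:
        return "Duplicate"
    seen_keys.add(key)
    return "Processed"
```

Normalizing the key before comparing is a design choice for the sketch: it keeps the same invoice from slipping past the duplicate check just because a supplier name or number was typed with different spacing or punctuation.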
  • Suggested answer
    Valantis
    Email body classification alone will always be unreliable for the exact reasons you're hitting: forwarded threads, signatures, and quoted text create too much noise.

    The recommended approach is hybrid, with document-level classification as the primary signal:

    1. AI Builder Document Classification: train a custom model on your actual invoice PDFs vs non-invoice PDFs. This classifies based on document content, not the email. It's the most reliable approach and handles mixed attachments correctly since each PDF is classified independently. Add this after your PDF check, before the SharePoint step.

    2. Keep email body as a pre-filter only: use it to exclude obvious non-invoices early (e.g. if the subject contains "quotation") but don't rely on it for positive classification.

    3. Structure the flow in three separate scopes: a Classification scope (AI Builder call), a Validation scope (check that required fields exist: invoice number, date, amount), and a Storage scope (SharePoint write). Separating these makes error handling and debugging much cleaner.

    For multiple attachments: process each attachment independently inside Apply to each, not as a batch. Each PDF gets its own classification result.
    AI Builder document classification requires AI Builder credits but the accuracy improvement over text matching is significant for production use.
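
To make the three-scope idea concrete, here is a small Python sketch of the same structure. It is only an analogy for the flow design, not Power Automate code: `classify`, `validate`, and `store` are hypothetical callables you would supply, and the folder names are examples. Each attachment is handled on its own, and a failure records which stage broke without stopping the other attachments.

```python
from typing import Callable, Iterable


def process_attachments(
    attachments: Iterable[dict],
    classify: Callable[[dict], str],
    validate: Callable[[dict], bool],
    store: Callable[[dict, str], None],
) -> list:
    """Run classification, validation, and storage as separate stages per attachment."""
    results = []
    for attachment in attachments:
        stage = "classification"
        try:
            label = classify(attachment)
            folder = "Others"
            if label == "invoice":
                stage = "validation"
                folder = "Processed" if validate(attachment) else "Validation Failed"
            stage = "storage"
            store(attachment, folder)
            results.append({"name": attachment["name"], "folder": folder, "error": None})
        except Exception as exc:
            # Record the failing stage and continue with the next attachment.
            results.append({"name": attachment["name"], "folder": None,
                            "error": f"{stage}: {exc}"})
    return results
```

In a flow, the `stage` label corresponds to knowing which scope failed, which is what makes run-after error handling per scope easier to reason about.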
     

     

    Best regards,

    Valantis

    ✅ If this helped solve your issue, please Accept as Solution so others can find it quickly.

    ❤️ If it didn’t fully solve it but was still useful, please click “Yes” on “Was this reply helpful?” or leave a Like :).

    🏷️ For follow-ups  @Valantis.

    📝 https://valantisond365.com/

    💼 LinkedIn

    ▶️ YouTube

  • IB-22040114-0
    Hi, thank you for the advice. Here is the flow that I built based on what you suggested. For AI Builder, I would like to use a model from AI Hub. Is that possible, and does it take much time to set up?
  • Suggested answer
    Valantis
     
    The AI Builder Invoice Processing model is a prebuilt model, meaning it requires no training time at all. It's ready to use immediately from Power Automate without any setup.

    In your flow, add the Process and save invoices or Extract information from invoices action from the AI Builder connector. It handles standard invoice fields (invoice number, date, vendor, amount, line items) out of the box.

    If you want to use it from AI Hub in Power Apps / make.powerapps.com, you can test it there first to see the extracted fields, but for production use in your flow just add it directly as a Power Automate action.

    The only thing you need is sufficient AI Builder credits in your environment. The prebuilt invoice model consumes credits per page processed.
     

     


  • IB-22040114-0
    Hi, I asked my supervisor to add AI Builder credits and she granted them, but after adding the action in Power Automate, I am seeing this expression in the Process Invoice action. Any suggestions on how to fix it?
     

     
                                                                     
  • Suggested answer
    Valantis
     
    That JSON object ({"consumptionSource":"PowerAutomate"...}) is the AI Builder billing metadata appearing in the wrong field. It means the AI Builder action input isn't correctly mapped.
     
    The most likely cause: the AI Document file input field is receiving the wrong dynamic content. The Process Invoice action expects the file content as base64 encoded binary, not a metadata object.
     
    Fix:
    1. In the Process Invoice action, click on the AI Document (file) input field
    2. Make sure you're passing the attachment content from your Apply to each loop, specifically the attachment's body/contentBytes field from the trigger, not the attachment metadata
    3. The expression should be something like: triggerOutputs()?['body/attachments'][0]['contentBytes'] or the dynamic content item()?['contentBytes'] from inside Apply to each
     
    If you're getting the attachment from a Get attachment content action, pass the Body output of that action to the AI Builder file input.
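
To show the difference between an attachment's metadata and its file content, here is a minimal Python sketch. It assumes the attachment object has `name` and `contentBytes` properties as in the expressions above, with `contentBytes` holding base64 text. The `%PDF-` header check is an extra sanity check added for the example, not something the Process Invoice action does.

```python
import base64


def pdf_bytes_from_attachment(attachment: dict) -> bytes:
    """Return the decoded file bytes for a PDF attachment, or raise ValueError."""
    name = str(attachment.get("name", "")).lower()
    content = attachment.get("contentBytes")
    if not isinstance(content, str) or not content:
        # Typical symptom of mapping a metadata object instead of the file content.
        raise ValueError("attachment has no contentBytes")
    data = base64.b64decode(content, validate=True)
    # A PDF file normally begins with the "%PDF-" header.
    if not name.endswith(".pdf") or not data.startswith(b"%PDF-"):
        raise ValueError(f"{name or 'attachment'} is not a PDF")
    return data
```

Checking the decoded header as well as the extension catches files that are merely named `.pdf`, which the filename-only PDF check would let through.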
     


  • IB-22040114-0
    Hi, AI Builder credits are not available, so I need to handle classification manually in the current flow design. Do you have any suggestions for implementing this without AI?
  • Suggested answer
    Valantis
     
    Without AI Builder, you'll need to rely on rule-based classification. Here's the most reliable approach without AI credits:
     
    1. Extract PDF text content: use the Get file content action to get the PDF, then use the Encodian Convert PDF to text action (if available in your environment) to read the document text. If no PDF text extraction is available, fall back to the HTML-to-text approach you already have on the email body. Extracting the PDF text gives you document-level text rather than just the email body.
     
    2. Apply keyword rules on the extracted text with scoring. Instead of a simple contains check, score each keyword hit:
       - Contains 'Invoice No' or 'Invoice #' or 'INV-': +2 points
       - Contains 'Total Amount' or 'Amount Due': +1 point
       - Contains 'Quotation' or 'Proforma' or 'Quote': -3 points
       - Score > 1: classify as invoice
     
    3. Use both attachment name AND content. If the filename contains 'INV' or 'invoice', weight that as an additional signal.
     
    4. For subject line pre-filtering: add a condition before the loop. If the email subject contains 'quotation' or 'proforma', route directly to Others without processing attachments.
     
    This won't be as accurate as AI Builder but is significantly better than email body only because you're reading the actual document content.
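
The scoring rules in step 2 can be prototyped in a few lines of Python before building the Compose actions. This is a sketch of the rules exactly as listed above, with two assumptions: matching is case-insensitive, and each rule group counts once no matter how many of its keywords appear.

```python
# (keywords, points) pairs taken from the rules listed in step 2.
RULES = [
    (("invoice no", "invoice #", "inv-"), 2),
    (("total amount", "amount due"), 1),
    (("quotation", "proforma", "quote"), -3),
]
THRESHOLD = 1  # classify as invoice only when the score is greater than this


def score_text(text: str) -> int:
    """Sum the points of every rule group with at least one keyword hit."""
    lowered = text.lower()
    return sum(points for keywords, points in RULES
               if any(keyword in lowered for keyword in keywords))


def is_invoice(text: str) -> bool:
    return score_text(text) > THRESHOLD
```

For example, text containing "Proforma Invoice No" and "Amount Due" scores 2 + 1 - 3 = 0 and is rejected, while plain "Invoice No" with "Total Amount" scores 3 and passes.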
     

     


  • IB-22040114-0
    Hi, thanks for the advice. I followed your suggestion and built the scoring based on the email.
    Here is the flow design that I currently have:
    Trigger:
    When a new email arrives (V3)

    ↓
    Compose – Subject
    Compose – SenderEmail
    Compose – MessageID
    Compose – HTML Body
    Html to Text (Body Clean)
    Compose – Cleaned Body
    Compose – Sender Domain

    ↓
    Condition – Pre Filter Check
    (IF email is relevant)

    ├── ❌ NO
    │ → Compose: "No Invoice"
    │ → END
    │
    └── ✅ YES (Continue Branch)

    ↓
    Condition – Has Attachment?

    ├── ❌ NO
    │ → Compose: "No Attachment"
    │ → END
    │
    └── ✅ YES

    ↓
    Apply to each (Attachments)

    ↓
    Compose – Check Name
    (toLower(item()?['name']))

    ↓
    Condition – Business Document Check
    (PDF filter only)

    ├── ❌ FALSE
    │ → Compose: "Skipped (Not PDF)"
    │
    └── ✅ TRUE (PDF ONLY)

    ↓
    ────────────────
    📊 SCORING ENGINE
    ────────────────

    Compose – PositiveScore
    Compose – NegativeScore
    Compose – FinalScore

    ↓
    Condition – Score Decision

    ├── ❌ LOW SCORE
    │
    │ ↓
    │ Create File → /Non Invoice/
    │
    │ ↓
    │ Create Item (SharePoint)
    │ Status = Non Invoice
    │ ConfidenceScore = FinalScore
    │
    │
    └── ✅ HIGH SCORE

    ↓
    ───────────────────────
    ✅ BUSINESS VALIDATION
    ───────────────────────

    Compose – UniqueKey

    ↓
    Get Items (SharePoint)
    (Check duplicate)

    ↓
    Condition – Duplicate?

    ├── ✅ YES (Duplicate)
    │
    │ ↓
    │ Create File → /Duplicate/
    │
    │ ↓
    │ Create Item
    │ Status = Duplicate
    │ ConfidenceScore = FinalScore
    │
    │
    └── ❌ NO

    ↓
    Condition – Validation Check

    ├── ❌ FAILED
    │
    │ ↓
    │ Create File → /Validation Failed/
    │
    │ ↓
    │ Create Item
    │ Status = Validation Failed
    │ ConfidenceScore = FinalScore
    │
    │
    └── ✅ PASSED

    ↓
    Create File → /Processed/

    ↓
    Create Item
    Status = Processed ✅
    DocumentType = Invoice
    ConfidenceScore = FinalScore

    I know I need to use the email subject and body to identify the attachment, but I am having a hard time classifying the PDF files by name, because they are usually named like NVP0201_422124009.pdf, BWS1.pdf, BLOW WOLL - 260050.pdf, and PK26008979.pdf. The problem starts at the scoring engine: when the flow runs, the file is skipped. Can you suggest a solution?
  • Suggested answer
    Valantis
     
    The flow structure looks good. The issue is that filenames like NVP0201_422124009.pdf and BWS1.pdf contain no obvious invoice keywords, so the filename scoring gives 0 points and the files get skipped or classified as non-invoice.
     
    For the scoring engine, shift the weight away from filename and toward the email subject and body since you don't have PDF text extraction:
     
    For the PositiveScore compose, combine checks:
    - Email subject contains 'invoice', 'inv', 'tax invoice': +3
    - Email body contains 'invoice no', 'amount due', 'total amount': +2
    - Sender domain is a known supplier: +1
    - Filename starts with known prefixes (NVP, BWS, etc. from your suppliers): +1
     
    For the NegativeScore:

    - Subject or body contains 'quotation', 'proforma', 'quote', 'credit note': -3
     
    For the filename specifically, you can build a known prefix list for your suppliers over time. Store them in a SharePoint list and check if the filename starts with any of those prefixes using a Get items call against your prefix list.
     
    The key insight: with these filenames, the email context (subject, sender, body) is more reliable than the filename itself. Your pre-filter condition before the Apply to each loop is actually the most important classification step, so make sure it's doing the heavy lifting.
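
The PositiveScore and NegativeScore rules above could be prototyped like this in Python before wiring them into Compose actions. It is a sketch under assumptions: `known_domains` and `known_prefixes` are hypothetical inputs standing in for your supplier list, matching is case-insensitive substring matching, and each rule contributes its points at most once.

```python
POSITIVE_SUBJECT = ("invoice", "inv", "tax invoice")
POSITIVE_BODY = ("invoice no", "amount due", "total amount")
NEGATIVE = ("quotation", "proforma", "quote", "credit note")


def contains_any(text: str, keywords) -> bool:
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def final_score(subject: str, body: str, sender: str, filename: str,
                known_domains, known_prefixes) -> int:
    """Combine the email-context signals into one score."""
    positive = 0
    if contains_any(subject, POSITIVE_SUBJECT):
        positive += 3
    if contains_any(body, POSITIVE_BODY):
        positive += 2
    domain = sender.rsplit("@", 1)[-1].lower()
    if domain in known_domains:
        positive += 1
    # Supplier prefixes such as NVP or BWS, compared case-insensitively.
    prefixes = tuple(prefix.upper() for prefix in known_prefixes)
    if filename.upper().startswith(prefixes):
        positive += 1
    negative = -3 if contains_any(subject + " " + body, NEGATIVE) else 0
    return positive + negative
```

One caveat with plain substring matching: a short keyword like 'inv' also matches unrelated words such as "invitation", so it may be worth matching whole words for the short keywords. The score still needs a cut-off in the Score Decision condition; the earlier "Score > 1" rule is one option.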
     

     

