Skip to main content

Notifications

Extract data from html table in email body

Use Case

I have found that there are many posts related to extract data from html table in the email body. This is the common scenario when organization tries to automate the process to increase personal productivity of users. So, I thought of sharing a solution to accomplish this task.

 

Scenario

Let's take an example of getting information from partners. The users send a template to business partners in the form of html table to respond. Partners replied on the mail with the required information. Using Power Automate cloud flow, we will extract the data or required information from the business and stores those values in the excel sheet saved in SharePoint document library.

 

Input Source (Email)

ManishSolanki_2-1696763315968.png

We will extract the values for Price/PC, Currency, MOQ & Lead-time (weeks) from the email received from business partners.

 

Target Excel File

The target master excel file sheet has been saved in SharePoint document library. The excel sheet contains a table which has unique column 'UniqueID'. Based on the unique column, values extracted from email body will be updated in the excel file.

ManishSolanki_1-1696763020797.png

 

Solution

There are many solutions to extract the data from email like convert html to text and parsing the string, AI builder etc. But here, I will make use of fx expressions to achieve this.

 

1. Start by creating a new Automated cloud flow.

ManishSolanki_3-1696763546722.png

 

2. Enter the flow name & select the trigger "When an email arrives (V3)". Press Create button to proceed.

ManishSolanki_4-1696763667160.png

 

3. Expand trigger action by right click it. When an email arrives provides various filter options to choose from and accordingly the flow will trigger. You can filter based on sender, recipient(s), subject, with or without attachment(s) etc. Here, we will filter the incoming email on subject "RQF Automated" so that the flow will be triggered on a specific email.

ManishSolanki_5-1696763758234.png

 

4. Add Compose action and write an expression to extract html table from the email body

ManishSolanki_6-1696763874526.png

concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>')

 

5. Run the flow once to get to know which tags to target for extracting the data. Depends on the column's formatting, html tags may vary so it recommended to know beforehand. Analyze the output of compose action and note the tags that contains header of data values. 

<table width=\"100%\" style=\"box-sizing:border-box; border-collapse:collapse; border-spacing:0px; font-family:SegoeUI,Lato,&quot;Helvetica Neue&quot;,Helvetica,Arial,sans-serif; font-size:15px; font-weight:300; background-color:rgb(255,255,255)\"><tbody style=\"box-sizing:border-box\"><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Item Code</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Supplier</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">QTY</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">UniqueID</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Price/PC</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Currency</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">MOQ</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Lead-time (weeks )</strong></p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEF</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEFXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGH</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGHXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGL</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGLXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">3</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOP</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOPXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">40</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">8</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRST</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRSTZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">50</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td></tr><tr style=\"box-sizing:border-box\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">60</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">15</p></td></tr></tbody></table>

Here, we will target the rows with data columns and find the route or path (table > tbody > tr > td > p). Using this path in XPATH function, we will retrieve the data values.

 

6. Add another compose action, to get the values from html table using expression. In the expression, we will convert the string to xml & then apply XPATH to get values. We also convert into rows and each rows contains corresponding all column values using chunk function.

ManishSolanki_7-1696764142848.png

chunk(xpath(xml(outputs('Compose')),'//table//tbody//tr//td//p//text()'),8)
Pls note that as we have total 8 columns in the html table so we will use 8 in chunk function. The chunk function creates element for each row (including row) and each row contains an array of 8 elements (columns)

 

7. Next, add "Apply to each" action to iterate each row. As output array of compose 2 also contains header row, so we will use expression to start iterating from second element & skips the header row:

ManishSolanki_0-1696764363440.png

skip(outputs('Compose_2'),1)

 

Add "Update a row" action inside apply to each block, to update the row in excel based on the unique column.

ManishSolanki_1-1696764495287.png

Expression used for each column:

Column Name

Expression

Key Value
item()?[3]
Price/PC
item()?[4]
Currency
item()?[5]
MOQ
item()?[6]
Lead-time (weeks )
item()?[7]

As index starts from zero (0), so to get the value of first column (Item Code) you need to use item()?[0]. You could get the value of each row in the iteration using the indexer.

 

Output

When an email arrives with subject "RFQ Automated", the flow will trigger. It extracts the values from table present in the email body and update the data in the excel file

ManishSolanki_2-1696764809010.png

 

Conclusion

So, we can now say that using fx expression we can automate the process by extracting the data from the table in the email body. This is the power of expressions in power platform. Using same concept, we could extract the data from any source as we could easily transforms JSON object to XML using expression and further apply filter using XAPTH as per the business requirement.

Comments

*This post is locked for comments

  • Stev916 Profile Picture Stev916
    Posted at
    Extract data from html table in email body

    Excellent article! Very helpful! Kudos

  • Swsr_2000 Profile Picture Swsr_2000 2
    Posted at
    Extract data from html table in email body

    If I want to make a flow which will work in any situation. Suppose I make a flow for a table of 8 columns. but suddenly data is not coming(from source) in 8 columns . Suddenly it started to come in 2 columns in this situation also my flow will work. @ManishSolanki  your flow is great but if I get 2 cloumned table in my Email this flow will not work.

  • ManishSolanki Profile Picture ManishSolanki 15,075
    Posted at
    Extract data from html table in email body

    Hi @sven7904 

     

    You need to create an expression that picks the first table. In this blog, I demonstrated steps for a single html table. You can share me the email html (with sample date) either here or via direct message, I will try creating an expression using that html table and will share the expression with you.

     

    Thank you!

  • sven7904 Profile Picture sven7904
    Posted at
    Extract data from html table in email body

    Hi @ManishSolanki,

     

    I have a similar case, only with two table sections in the email.

    When i run this WF like this, I only get the bottom table data. How do I get to the top/first?

     

    Thanks

  • ManishSolanki Profile Picture ManishSolanki 15,075
    Posted at
    Extract data from html table in email body

    Hi @LoadUpAndRide 

     

    Pls check if "Split on" switch is toggled on and output array is selected from the drop down in trigger:

     

    Navigate to (...)->Settings:

    ManishSolanki_0-1706075143959.pngManishSolanki_1-1706075179951.png

     

  • LoadUpAndRide Profile Picture LoadUpAndRide 47
    Posted at
    Extract data from html table in email body

    @ManishSolanki 

    I have an email that comes in that i have triggered. However your concat command is expecting a string and it gives me an error. I pasted what my email looks like below to help give some context. 

     

    concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>')

     Here is what I am trying to do

    1) My Headers are always the same. 13 columns. SR# and SNOW# also contain a #. That may cause an issue. 

    2) My Rows are dynamic. They depend on the number of tickets in the email

    3) There are two tables in the email. I only want to use the first table. 

    4) There may be highlighted info on one of the rows depending on the if the Location is a top 500 location so formatting may mess with finding rows and columns. Maybe not

     

     

    This is the error i get: 

    InvalidTemplate

    Unable to process template language expressions in action 'Concat' inputs at line '0' and column '0': 'The template language function 'split' expects its first parameter to be of type string. The provided value is of type 'Null'. Please see https://aka.ms/logicexpressions#split for usage details.'.

     

     

    Open Severity 1 SRs

    The 3 open tickets are sorted by oldest Due Date. Highlighted rows indicate a top 500 location (BK or FS). All resolved severity one ticket information is scheduled to be sent at 10:30pm eastern time daily.

     

    Location

    SR#

    SNOW#

    Due

    ETA

    Summary

    Sub-Status

    Technician

    Problem_Description

    Zone

    Team

    BK_Rank

    FS_Rank

    ABC11111

    1-11111111

    INC11111111

    01/23/2024 17:41:55

    01/23/2024 14:00:00

    Ipsum faucibus vitae aliquet nec ullamcorper sit.

    On Site

    Hogan, Hulk

    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ipsum faucibus vitae aliquet nec ullamcorper sit. Vestibulum lectus mauris ultrices eros in cursus turpis massa tincidunt. Sed augue lacus viverra vitae congue eu consequat ac felis. In nibh mauris cursus mattis molestie a iaculis at erat.

    3

    4

    9036

    9721

    ABC22222
    1-2222222222
    INC222222222
    01/23/2024 17:59:56
    01/23/2024 15:00:00
    Blandit turpis cursus in hac habitasse platea dictumst quisque.
    On Site
    Bond, James
    Blandit turpis cursus in hac habitasse platea dictumst quisque. Id ornare arcu odio ut. Mollis aliquam ut porttitor leo a diam. Neque ornare aenean euismod elementum nisi quis eleifend quam. Quam lacus suspendisse faucibus interdum. Ullamcorper velit sed ullamcorper morbi tincidunt ornare.
    1
    5
    93
    1578

    ABC33333

    1-333333333

    INC3333333

    01/23/2024 19:07:48

     

    Consectetur adipiscing elit pellentesque habitant.

    En Route

    Mouse, Mickey

    Store Router Consectetur adipiscing elit pellentesque habitant. Purus faucibus ornare suspendisse sed. Ut tellus elementum sagittis vitae et. Habitant morbi tristique senectus et netus. Duis ut diam quam nulla. Cras pulvinar mattis nunc sed blandit libero volutpat sed cras.

    6

    5

    7903

    8075

     

    Location Details for Severity 1 SRs

    The location details below are sorted by Location Name. Highlighted rows a indicate top 500 location (BK or FS).

     

    Location_(w/Zeplin)

    Address(w/Map)

    Telephone

    BK_Rank

    FS_Rank

    FTS_Contacts

    BK_Contacts

    FS_Contacts

    Related_Site_Info

    ABC11111
    123 AnyWhere St Any Town, TX 12345
    (555) 555-1212
    93
    1578
    PermTech:
    Name

    Lead:
    Name

    Sup Mgr:
    Name

    Lead Manager:
    Name
    text 

    email@email.com

    text

    email@email.com

    text

    email@email.com

     

    text 

    email@email.com

    text

    email@email.com

    text

    email@email.com

     

    24 Hrs:


    # of Racks:


    Location(s):

    ABC22222

    123 AnyWhere St Any Town, TX 12345

    (555) 555-1212

    7903

    8075

    PermTech:
    Name

    Lead:
    Name

    Sup Mgr:
    Name

    Lead Manager:
    Name
    text 

    email@email.com

    text

    email@email.com

    text

    email@email.com

     

    text 

    email@email.com

    text

    email@email.com

    text

    email@email.com

     

    24 Hrs:


    # of Racks:


    Location(s):

    ABC33333

    123 AnyWhere St Any Town, TX 12345

    (555) 555-1212

    9036

    9721

    PermTech:
    Name

    Lead:
    Name

    Sup Mgr:
    Name

    Lead Manager:
    Name
    text 

    email@email.com

    text

    email@email.com

    text

    email@email.com

     

     

    24 Hrs:
    N

    # of Racks:


    Location(s):

     

    This message is automatically generated on an hourly basis from 6am - 10pm  eastern time. If you have problems or questions, please email the Group

  • ManishSolanki Profile Picture ManishSolanki 15,075
    Posted at
    Extract data from html table in email body

    Hi @nathan-sobel 

     

    Awesome, you find it useful 👍

     

    Pls direct message me the html code, I will give a try. You could share with sample data if it sensitive.

     

    Thanks

  • NB-13052123-0 Profile Picture NB-13052123-0
    Posted at
    Extract data from html table in email body

    @ManishSolanki 

     

    I am new to power automate but your instructions are very helpful. I have it working so far accept the email we recieve has two tables but the compose is grabing from the second and we need to extract only the first table in the email. How can this be done. 

  • Extract data from html table in email body

    Hi Manish @ManishSolanki ,

    Thank you for providing this example on how to build a workflow that extracts html table from the body of an email. Thanks once again for helping me to build my workflow and providing a suitable solution to populate the table in Excel. You are amazing 🙂 

     

     

    Best Regards,

    Chukwudi.

  • ManishSolanki Profile Picture ManishSolanki 15,075
    Posted at
    Extract data from html table in email body

    Hi @pkn3081_rbm1 

     

    Pls use the below expression in compose action to handle <br> tags:

    replace(replace(replace(concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>'),'<br>',''),'&nbsp',''),decodeUriComponent('%0A'),'')

     

    Regarding adding row in excel, I will publish in another blog. But for a time being, you could use "Add rows in a table" action to do the same. For key column, I would suggest choosing the columns from the html table in email body. But you could also set it as GUID field by using guid() as an expression. You could follow the steps shared to @paflo for creating an excel file in one of the replies.

     

    Thanks