Extract data from html table in email body
Use Case
I have found that there are many posts related to extract data from html table in the email body. This is the common scenario when organization tries to automate the process to increase personal productivity of users. So, I thought of sharing a solution to accomplish this task.
Scenario
Let's take an example of getting information from partners. The users send a template to business partners in the form of html table to respond. Partners replied on the mail with the required information. Using Power Automate cloud flow, we will extract the data or required information from the business and stores those values in the excel sheet saved in SharePoint document library.
Input Source (Email)
We will extract the values for Price/PC, Currency, MOQ & Lead-time (weeks) from the email received from business partners.
Target Excel File
The target master excel file sheet has been saved in SharePoint document library. The excel sheet contains a table which has unique column 'UniqueID'. Based on the unique column, values extracted from email body will be updated in the excel file.
Solution
There are many solutions to extract the data from email like convert html to text and parsing the string, AI builder etc. But here, I will make use of fx expressions to achieve this.
1. Start by creating a new Automated cloud flow.
2. Enter the flow name & select the trigger "When an email arrives (V3)". Press Create button to proceed.
3. Expand trigger action by right click it. When an email arrives provides various filter options to choose from and accordingly the flow will trigger. You can filter based on sender, recipient(s), subject, with or without attachment(s) etc. Here, we will filter the incoming email on subject "RQF Automated" so that the flow will be triggered on a specific email.
4. Add Compose action and write an expression to extract html table from the email body
concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>')
5. Run the flow once to get to know which tags to target for extracting the data. Depends on the column's formatting, html tags may vary so it recommended to know beforehand. Analyze the output of compose action and note the tags that contains header of data values.
<table width=\"100%\" style=\"box-sizing:border-box; border-collapse:collapse; border-spacing:0px; font-family:SegoeUI,Lato,"Helvetica Neue",Helvetica,Arial,sans-serif; font-size:15px; font-weight:300; background-color:rgb(255,255,255)\"><tbody style=\"box-sizing:border-box\"><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Item Code</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Supplier</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">QTY</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">UniqueID</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Price/PC</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Currency</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">MOQ</strong></p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p style=\"box-sizing:border-box; margin:0px\"><strong class=\"x_ContentPasted0\" style=\"box-sizing:border-box; font-weight:bold\">Lead-time (weeks )</strong></p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEF</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ABCDEFXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGH</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">EFGHXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGL</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">1000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">HIGLXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">3</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">20</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOP</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">XYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">2000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">MNOPXYZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">4</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">40</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">8</p></td></tr><tr style=\"box-sizing:border-box; border-bottom:1px solid rgb(234,234,234)\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRST</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">QRSTZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">50</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">10</p></td></tr><tr style=\"box-sizing:border-box\"><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZ</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">ZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">5000</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">UVWZZMR</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">6</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">USD</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">60</p></td><td style=\"box-sizing:border-box; min-width:40px; line-height:initial\"><p class=\"x_ContentPasted0\" style=\"box-sizing:border-box; margin:0px\">15</p></td></tr></tbody></table>
Here, we will target the rows with data columns and find the route or path (table > tbody > tr > td > p). Using this path in XPATH function, we will retrieve the data values.
6. Add another compose action, to get the values from html table using expression. In the expression, we will convert the string to xml & then apply XPATH to get values. We also convert into rows and each rows contains corresponding all column values using chunk function.
chunk(xpath(xml(outputs('Compose')),'//table//tbody//tr//td//p//text()'),8)
Pls note that as we have total 8 columns in the html table so we will use 8 in chunk function. The chunk function creates element for each row (including row) and each row contains an array of 8 elements (columns)
7. Next, add "Apply to each" action to iterate each row. As output array of compose 2 also contains header row, so we will use expression to start iterating from second element & skips the header row:
skip(outputs('Compose_2'),1)
Add "Update a row" action inside apply to each block, to update the row in excel based on the unique column.
Expression used for each column:
Column Name |
Expression |
Key Value |
item()?[3] |
Price/PC |
item()?[4] |
Currency |
item()?[5] |
MOQ |
item()?[6] |
Lead-time (weeks ) |
item()?[7] |
As index starts from zero (0), so to get the value of first column (Item Code) you need to use item()?[0]. You could get the value of each row in the iteration using the indexer.
Output
When an email arrives with subject "RFQ Automated", the flow will trigger. It extracts the values from table present in the email body and update the data in the excel file
Conclusion
So, we can now say that using fx expression we can automate the process by extracting the data from the table in the email body. This is the power of expressions in power platform. Using same concept, we could extract the data from any source as we could easily transforms JSON object to XML using expression and further apply filter using XAPTH as per the business requirement.
Comments
-
-
Extract data from html table in email body
If I want to make a flow which will work in any situation. Suppose I make a flow for a table of 8 columns. but suddenly data is not coming(from source) in 8 columns . Suddenly it started to come in 2 columns in this situation also my flow will work. @ManishSolanki your flow is great but if I get 2 cloumned table in my Email this flow will not work.
-
Extract data from html table in email body
Hi @sven7904
You need to create an expression that picks the first table. In this blog, I demonstrated steps for a single html table. You can share me the email html (with sample date) either here or via direct message, I will try creating an expression using that html table and will share the expression with you.
Thank you!
-
Extract data from html table in email body
Hi @ManishSolanki,
I have a similar case, only with two table sections in the email.
When i run this WF like this, I only get the bottom table data. How do I get to the top/first?
Thanks
-
Extract data from html table in email body
Pls check if "Split on" switch is toggled on and output array is selected from the drop down in trigger:
Navigate to (...)->Settings:
-
Extract data from html table in email body
I have an email that comes in that i have triggered. However your concat command is expecting a string and it gives me an error. I pasted what my email looks like below to help give some context.
concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>')
Here is what I am trying to do
1) My Headers are always the same. 13 columns. SR# and SNOW# also contain a #. That may cause an issue.
2) My Rows are dynamic. They depend on the number of tickets in the email
3) There are two tables in the email. I only want to use the first table.
4) There may be highlighted info on one of the rows depending on the if the Location is a top 500 location so formatting may mess with finding rows and columns. Maybe not
This is the error i get:
InvalidTemplate
Unable to process template language expressions in action 'Concat' inputs at line '0' and column '0': 'The template language function 'split' expects its first parameter to be of type string. The provided value is of type 'Null'. Please see https://aka.ms/logicexpressions#split for usage details.'.Open Severity 1 SRs
The 3 open tickets are sorted by oldest Due Date. Highlighted rows indicate a top 500 location (BK or FS). All resolved severity one ticket information is scheduled to be sent at 10:30pm eastern time daily.
Location
SR#
SNOW#
Due
ETA
Summary
Sub-Status
Technician
Problem_Description
Zone
Team
BK_Rank
FS_Rank
ABC11111
1-11111111
INC11111111
01/23/2024 17:41:55
01/23/2024 14:00:00
Ipsum faucibus vitae aliquet nec ullamcorper sit.
On Site
Hogan, Hulk
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ipsum faucibus vitae aliquet nec ullamcorper sit. Vestibulum lectus mauris ultrices eros in cursus turpis massa tincidunt. Sed augue lacus viverra vitae congue eu consequat ac felis. In nibh mauris cursus mattis molestie a iaculis at erat.
3
4
9036
9721
ABC22222
1-2222222222
INC222222222
01/23/2024 17:59:56
01/23/2024 15:00:00
Blandit turpis cursus in hac habitasse platea dictumst quisque.
On Site
Bond, James
Blandit turpis cursus in hac habitasse platea dictumst quisque. Id ornare arcu odio ut. Mollis aliquam ut porttitor leo a diam. Neque ornare aenean euismod elementum nisi quis eleifend quam. Quam lacus suspendisse faucibus interdum. Ullamcorper velit sed ullamcorper morbi tincidunt ornare.
1
5
93
1578
ABC33333
1-333333333
INC3333333
01/23/2024 19:07:48
Consectetur adipiscing elit pellentesque habitant.
En Route
Mouse, Mickey
Store Router Consectetur adipiscing elit pellentesque habitant. Purus faucibus ornare suspendisse sed. Ut tellus elementum sagittis vitae et. Habitant morbi tristique senectus et netus. Duis ut diam quam nulla. Cras pulvinar mattis nunc sed blandit libero volutpat sed cras.
6
5
7903
8075
Location Details for Severity 1 SRs
The location details below are sorted by Location Name. Highlighted rows a indicate top 500 location (BK or FS).
Location_(w/Zeplin)
Address(w/Map)
Telephone
BK_Rank
FS_Rank
FTS_Contacts
BK_Contacts
FS_Contacts
Related_Site_Info
ABC11111
123 AnyWhere St Any Town, TX 12345
(555) 555-1212
93
1578
PermTech:
Name
Lead:
Name
Sup Mgr:
Name
Lead Manager:
Nametext
text
text
text
text
text
24 Hrs:
# of Racks:
Location(s):ABC22222
123 AnyWhere St Any Town, TX 12345
(555) 555-1212
7903
8075
PermTech:
Name
Lead:
Name
Sup Mgr:
Name
Lead Manager:
Nametext
text
text
text
text
text
24 Hrs:
# of Racks:
Location(s):ABC33333
123 AnyWhere St Any Town, TX 12345
(555) 555-1212
9036
9721
PermTech:
Name
Lead:
Name
Sup Mgr:
Name
Lead Manager:
Nametext
text
text
24 Hrs:
N
# of Racks:
Location(s):This message is automatically generated on an hourly basis from 6am - 10pm eastern time. If you have problems or questions, please email the Group
-
Extract data from html table in email body
Awesome, you find it useful 👍
Pls direct message me the html code, I will give a try. You could share with sample data if it sensitive.
Thanks
-
Extract data from html table in email body
I am new to power automate but your instructions are very helpful. I have it working so far accept the email we recieve has two tables but the compose is grabing from the second and we need to extract only the first table in the email. How can this be done.
-
Extract data from html table in email body
Hi Manish @ManishSolanki ,
Thank you for providing this example on how to build a workflow that extracts html table from the body of an email. Thanks once again for helping me to build my workflow and providing a suitable solution to populate the table in Excel. You are amazing 🙂
Best Regards,
Chukwudi.
-
Extract data from html table in email body
Pls use the below expression in compose action to handle <br> tags:
replace(replace(replace(concat('<table ',first(split(last(split(triggerOutputs()?['body/body'],'<table ')),'</table>')),'</table>'),'<br>',''),' ',''),decodeUriComponent('%0A'),'')
Regarding adding row in excel, I will publish in another blog. But for a time being, you could use "Add rows in a table" action to do the same. For key column, I would suggest choosing the columns from the html table in email body. But you could also set it as GUID field by using guid() as an expression. You could follow the steps shared to @paflo for creating an excel file in one of the replies.
Thanks
*This post is locked for comments