Hi SuperPowerAutomators,
I am currently working on an invoice processing project using AI Builder for my organization. The extraction is supposed to happen as soon as PDF invoices are received via Outlook. I have 5 collections of invoices that I have to train my model on. The objective is to tag specific data in these invoices for extraction and map it into a defined Excel table.
My challenge is:
Some invoice collections have multiple records on one page, and each record needs to be mapped to the target Excel sheet as a single row. I am having a hard time figuring out how to train my model, since only some of these documents have multi-customer records on a single page.
In the screenshot I have provided here, a single-page invoice contains 3 different customer records (three invoices merged into one, so to speak), and there could be even more. Other invoices are a single page with a single record, which is easier to tag and extract. The red lines in the screenshot mark where one record ends and the next begins. How do I approach this problem and get a solution that works?
The required data on this invoice is:
IN,
OUT,
Nights,
GuestName,
RatePerNight (this will be multiplied by Nights to calculate the AccommodationRate), and
TotalMeals+OtherCosts
If there is anything that you need clarity on, kindly ask me, and thanks for your assistance in advance.
Hello, how did you solve this problem? I have the same problem. My invoices contain products from different POs, but each PO number appears only once, at the beginning of its record, like yours. So I cannot tag the PO number multiple times (once for each of its products).
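One way to handle a value that appears only once at the start of each record, such as the PO number here, is to extract the lines as a flat list and then carry the most recent PO number forward onto every product line below it. This is a minimal Python sketch of that idea, not an AI Builder feature. It assumes the extracted lines arrive as dictionaries with the hypothetical keys `po_number` and `product`; the function name `fill_po_numbers` is also made up for illustration.

```python
def fill_po_numbers(lines):
    """Copy the last seen PO number onto each product line that follows it."""
    current_po = None
    filled = []
    for line in lines:
        # A line carrying a PO number starts a new record.
        if line.get("po_number"):
            current_po = line["po_number"]
        # Product lines inherit the PO number of the record they sit in.
        if line.get("product"):
            filled.append({"po_number": current_po, "product": line["product"]})
    return filled
```

Products that appear before any PO number has been seen get `None`, which makes them easy to spot and review by hand.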
Hi @JShoowa92
Judging by the screenshots you shared, you are running into issues with the preview action for GPT Prompts. A new action is now generally available. I would recommend re-creating your flow with the new action, which should fix the issues.
Please see the documentation here for reference: https://learn.microsoft.com/en-us/ai-builder/create-a-custom-prompt
Best,
Gwenael
Hello,
It's great that you're working on an invoice processing project using AI Builder. Addressing the challenge of extracting data from invoices that contain multiple records on a single page can indeed be complex. Here's a suggested approach to tackle this problem:
Hi, @ARB_wcc
Thank you for your response. I am not very experienced with Power Automate yet. I can confirm that I do have access to the GPT Prompt Builder connector in my environment, and I took a look at the flows you referred me to. However, I cannot seem to extract the data from my invoice the way you demonstrated in the AIPrompt.gif above. I believe there is a bug in the new Power Automate designer experience. Kindly refer to the attached screenshots:
Kindly guide me on how you got the extraction done, if possible, step by step. Thank you in advance.
Hi,
Do you have access to the GPT Prompt Builder connector in your environment?
GPT Prompt Builder goes GA and expands to EU,UK,AU - Power Platform Community (microsoft.com)
If so, this task can be automated in a couple of minutes with a custom prompt like the one below:
You are an AI trained to process invoices received as PDF files via email. Your task is to identify and extract specific information from invoices that may contain MULTIPLE CUSTOMER records on a single page. You need to recognize the beginning and end of *each* customer record on the invoice, and extract the following data points:
1. Check-in date (IN)
2. Check-out date (OUT)
3. Number of nights stayed (Nights)
4. Guest name (GuestName)
5. Rate per night (RatePerNight)
6. Total for meals and other costs (TotalMeals+OtherCosts)
For **EACH** customer record, calculate the 'AccommodationRate' by multiplying 'RatePerNight' by the 'Number of nights stayed'. The 'TotalMeals+OtherCosts' is the sum of all other charges excluding the room rate.
Please provide the extracted information in JSON format, with *SEPARATE ENTRIES* for each customer record. If any data point is unclear or missing, indicate it as "Data not available".
You are supposed to handle invoices with a varying number of customer records per page.
Here is the information from a sample invoice for your reference:
[Start of OCR Text]
Wee/ Hotel Erna r na joma Ave. - Invoice to: WALVIS BAY NAMIBIA Tax Invoice no. I Ref. No. AH-Your reference 22_ ._ . .... . our consultant HiIma 27/09/2023 Oty. Booking Details Guest Name Unit Total 1 Dinner re 61866 Abia 195.00 195.00 2 Coke ADM 15.00 30.00 2 Still Water 500m1 Abia 15.00 30.00 1 Dinner re 61862 Abia 195.00 195.00 1 Fruitree Guava Juice Abia 25.00 25.00 1 Standard Room single. Pax: 1 Abla 750.00 750.00 Breakfast IN: 18/09/2023, Out: 19/09/2023. Nights: 1 1 Dinner re 61871 Adelson 195.00 195.00 1 Fruaree Grape juice Adelson 25.00 25.00 1 Lunch re 61859 Aoilson 195.00 195.00 1 Fruitree Guava JuiceAdilson 25.00 25.00 1 Standard Room single. Pax: 1 Adelson 750.00 750.00 Breakfast IN: 18/09/2023. Out: 19/09/2023. Nights 1 1 Lunch re 61860 Jeremiah 0 195.00 195.00 1 Still Water 500m1 Jeremiah 0 15.00 15.00 1 Standard Room single. Pax: 1 Jeremiah 0 750.00 750.00 Breakfast IN: 18/09/2023, Out: 19/09/2023. Nights 1 All given prices in Namibia Dollar Subtotal: 3375.00
[End of OCR Text]
Using the above instructions, extract the relevant information.
Output:
For more info on how you can create this flow, refer to this post from @takolota - Extract Data From PDFs and Images With GPT - Power Platform Community (microsoft.com)
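Once the prompt returns its JSON, each entry still has to become one row in the Excel table. A minimal Python sketch of that step, outside Power Automate, might look like the following. It assumes the model returns a list of objects keyed by the six field names used in the prompt; the real output may be shaped differently and should be checked. The helper names `to_number` and `to_rows` are hypothetical.

```python
import json

MISSING = "Data not available"  # placeholder text requested in the prompt

def to_number(value):
    """Return value as a float, or None if it is missing or not numeric."""
    if value is None or value == MISSING:
        return None
    try:
        return float(str(value).replace(",", ""))
    except ValueError:
        return None

def to_rows(model_output):
    """Turn the model's JSON text into one dict per customer record."""
    records = json.loads(model_output)
    if isinstance(records, dict):  # a single record returned without a list
        records = [records]
    rows = []
    for rec in records:
        nights = to_number(rec.get("Nights"))
        rate = to_number(rec.get("RatePerNight"))
        rows.append({
            "IN": rec.get("IN"),
            "OUT": rec.get("OUT"),
            "Nights": nights,
            "GuestName": rec.get("GuestName"),
            "RatePerNight": rate,
            # AccommodationRate = RatePerNight * Nights; blank if either is missing
            "AccommodationRate": None if nights is None or rate is None else rate * nights,
            "TotalMeals+OtherCosts": to_number(rec.get("TotalMeals+OtherCosts")),
        })
    return rows
```

Recomputing AccommodationRate here, rather than trusting a value calculated by the model, keeps the arithmetic deterministic and leaves the cell empty when an input is missing.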
@gbego - New connector tested and approved! 😋