During the session I asked a question about using the AI to identify if a document matches a model. The challenge I have been having is to figure out which model to use.
I have found an approach that works with some degree of accuracy to identify invoices arriving by e-mail. Below is a snippet of a flow that monitors an e-mail box and checks whether an e-mail contains attachments; if it does, it checks whether the attachment type is PDF (this is where the code snippet picks up).
For PDFs it then uses AI Builder to OCR and extract the text. At first this appeared to be an odd approach, but the results are returned line by line, so I simply build an array of the extracted text lines. Then I have an If condition that checks whether the array contains "Invoice", "invoice", "INVOICE", etc.
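As a rough illustration of the keyword check (not the actual flow, which is built in Power Automate), here is a minimal Python sketch. The function name `looks_like_invoice` and the sample lines are made up for the example; it assumes the OCR step has already produced a list of text lines.

```python
def looks_like_invoice(lines, keyword="invoice"):
    """Return True if any OCR text line contains the keyword, ignoring case."""
    needle = keyword.casefold()
    # Substring match per line, so "Tax INVOICE #123" also counts.
    return any(needle in line.casefold() for line in lines)


# Made-up sample of lines as they might come back from an OCR step.
ocr_lines = ["ACME Corp", "INVOICE #", "Total due: 120.00"]
print(looks_like_invoice(ocr_lines))
```

Comparing in a single case matches the keyword anywhere in a line and avoids listing each capitalization separately.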
Once I know that the PDF appears to contain an invoice I can then use the AI Builder's prebuilt model called "Extract information from invoices" to build out the rest of the process.
I am early in my testing, but I would guess right now that the prebuilt model is at least 80% accurate on the different invoice types I have been using. I do think I will need to add some logic to identify invoices that don't fit the prebuilt model, for example filtering by sender, and push those invoices to a custom model.
I thought I would share my recent learning with those who saw @dpoggemann present the RPA topic. Thanks to him for providing a much-needed push on one of my side projects.
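The sender-based routing idea could look something like the following Python sketch. Everything here is hypothetical: the sender address, the `choose_model` name, and the "custom"/"prebuilt" labels are placeholders for illustration, not part of AI Builder.

```python
# Hypothetical list of senders whose invoices do not fit the prebuilt model.
CUSTOM_MODEL_SENDERS = {"billing@example.com"}


def choose_model(sender, custom_senders=CUSTOM_MODEL_SENDERS):
    """Return "custom" for known problem senders, otherwise "prebuilt"."""
    normalized = {s.strip().casefold() for s in custom_senders}
    if sender.strip().casefold() in normalized:
        return "custom"
    return "prebuilt"


print(choose_model("Billing@Example.com"))
print(choose_model("someone@example.org"))
```

Normalizing case and whitespace before the lookup keeps the routing from missing a sender just because the address is capitalized differently.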