Hi all,
I'm training my Forms Processing model using 14 samples, all with the same format. What I'm noticing is that the OCR is not extracting the data completely. When I run a Quick Test, the areas get selected correctly, but the extracted values are incorrect.
For example, in the PDF a date value is 2020.05.29, but it gets extracted as 20 .05 29. In another example, the PDF has the value 2019.11.29, but it gets extracted as 2019.1 29. In a third example, the value is 15452, but it gets extracted as 154 2. Is there anything I can do to improve the accuracy? The PDFs are generated by a third-party system, so I don't have much control over the format.
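
In case it helps to make the failure pattern concrete, here is a minimal Python sketch that checks the extracted strings against the format I expect. It is not part of Forms Processing. The patterns and the `looks_valid` helper are my own assumptions based on the three examples above, and it only flags bad values for review rather than repairing them (a dropped digit such as in 2019.1 29 cannot be reliably recovered).

```python
import re

# Hypothetical format checks; these patterns are assumptions drawn from the
# three examples in this post, not anything provided by Forms Processing.
DATE_PATTERN = re.compile(r"\d{4}\.\d{2}\.\d{2}")  # e.g. 2020.05.29
NUMBER_PATTERN = re.compile(r"\d+")                # e.g. 15452


def looks_valid(value: str, pattern: re.Pattern) -> bool:
    """Return True only if the whole extracted string matches the expected format."""
    return pattern.fullmatch(value) is not None


# (value in the PDF, value returned by the Quick Test, expected format)
samples = [
    ("2020.05.29", "20 .05 29", DATE_PATTERN),
    ("2019.11.29", "2019.1 29", DATE_PATTERN),
    ("15452", "154 2", NUMBER_PATTERN),
]

for expected, extracted, pattern in samples:
    status = "ok" if looks_valid(extracted, pattern) else "needs review"
    print(f"{extracted!r} (expected {expected!r}): {status}")
```

All three extracted values fail the check, so at the moment I would have to route every one of these documents to manual review.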