How to validate extracted data without using the Validation Station activity

Hello all,

I am working on a Document Understanding project. My objective is to extract data from scanned documents and write the extracted data to an Excel file. My requirements are as follows:

  1. I have a large number of files that I want to process with Document Understanding, but because of the Validation Station this seems impossible. Is it possible not to use the Validation Station?

  2. If I can do the work without the Validation Station activity, how can I make sure that the extracted data is correct? Is there any way to validate the extracted data against the input document?

For example, I have one scanned document in my project's input folder. I extract data from that scanned document and write it to Excel. I then want to compare the extracted result with the scanned document to make sure that the extracted result is correct. How would I do this in UiPath? Please share some ideas if this is possible.

I look forward to hearing your ideas.
Thanks in advance.

Hi,

  1. You don’t have to use the Validation Station.
  2. If you want human validation, you can use Action Center to do the validation more efficiently. You can also add a condition on the extraction confidence level: for example, if the confidence is above 90%, skip validation; otherwise, create a validation action in Action Center.

If you want to compare the extracted data against the document, use the Validation Station or Action Center.
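
The confidence check described above can be sketched in plain Python. This is only an illustration of the routing logic, not UiPath code: the `ExtractedField` type, the `route_document` function, and the two route names are hypothetical, and the 0.90 threshold simply mirrors the 90% example. In a workflow, the same decision would normally be an If condition placed before the step that creates the validation action.

```python
from dataclasses import dataclass
from typing import List

# Assumption: threshold taken from the "above 90%" example in the reply.
CONFIDENCE_THRESHOLD = 0.90


@dataclass
class ExtractedField:
    """Hypothetical stand-in for one extracted field and its confidence."""
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extractor


def route_document(fields: List[ExtractedField],
                   threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return "auto_accept" only if every field is above the threshold.

    Documents with no extracted fields, or with any field at or below
    the threshold, are routed to human validation.
    """
    if not fields:
        return "action_center"
    if any(f.confidence <= threshold for f in fields):
        return "action_center"
    return "auto_accept"


if __name__ == "__main__":
    invoice = [
        ExtractedField("invoice_number", "INV-001", 0.98),
        ExtractedField("total", "1250.00", 0.72),
    ]
    print(route_document(invoice))
```

Checking the lowest field confidence rather than an average is a deliberate choice here: one badly read field is enough to make the Excel row wrong, so a single low score sends the whole document to a human.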

Hi,

I’m trying to implement this functionality with Action Center, but when I run a Parallel For Each over all of my invoices, I run into out-of-memory exceptions at around 20 invoices. How do you implement this framework in a way that can handle a large volume of invoices?

Someone mentioned that you could do the digitization in a for loop and then use a Parallel For Each for the validation actions, but I’m not sure of the proper way to do this. I tried digitizing and classifying in a for loop, building a data table of all of the paths, document text, document object models, and classification results, and then running a Parallel For Each over that data table.

This method worked as expected but when I continue the process after validating in action center, I get an error message every time. Any suggestions would be most appreciated. Thanks!
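
One way to picture the "digitize in a loop, then validate in parallel" idea while keeping memory bounded is to work in small batches, so that only one batch of digitized documents is held at a time. The sketch below is plain Python rather than UiPath activities, and every name in it (`batched`, `process_in_batches`, and the `digitize` and `validate` callables) is hypothetical. It does not address the error after validation, since the error message is not given.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def process_in_batches(paths: Iterable[str],
                       digitize: Callable,
                       validate: Callable,
                       batch_size: int = 5) -> list:
    """Digitize sequentially, validate in parallel, one batch at a time."""
    results = []
    for batch in batched(paths, batch_size):
        # Sequential step: only `batch_size` digitized documents exist at once.
        digitized = [digitize(path) for path in batch]
        # Parallel step: results come back in the same order as the inputs.
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            results.extend(pool.map(validate, digitized))
        # `digitized` is rebound on the next iteration, releasing this batch.
    return results


if __name__ == "__main__":
    invoices = [f"invoice_{i}.pdf" for i in range(7)]
    done = process_in_batches(
        invoices,
        digitize=lambda p: {"path": p},          # placeholder for digitization
        validate=lambda d: d["path"] + " done",  # placeholder for validation
        batch_size=3,
    )
    print(done)
```

The batch size is the tuning knob: with roughly 20 invoices triggering the exception, a batch well below that number would be a reasonable starting assumption to test.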