Extract table data from multiple pages using Document Understanding

Hello All,

Can anyone please let me know how to extract table line data that spans multiple pages of an unstructured document using Document Understanding?
I’ve tried it on a single page using the Form Extractor and was able to write the extracted data to Excel. But when I try to extract the table data from a document of 50+ pages, nothing is extracted and nothing is written to Excel. Can anyone please help with this?

Hi @sgodi

Try one of the UiPath Document Understanding out-of-the-box ML packages (Invoices, etc.) with the ML Extractor and check whether it can extract the data. If it can't, you may have to label data on sample documents to train the ML package, which can then be used as a custom ML skill.

Hope this helps,
Best Regards.

Thanks for the response, @arjunshenoy. So in this case the form-based or intelligent form extractors will not work for this scenario?

@sgodi

IFE (Intelligent Form Extractor) is a form-based extraction method. It is best suited to use cases where data has to be extracted from documents with a fixed format. In other words, if your documents have little to no variation in layout, or if you need to mark certain fields as handwritten (a signature, for example), then it is a good choice.

Since you are dealing with an unstructured document of more than 50 pages, ML extraction is the recommended approach.

Hope this helps,
Best Regards.

Hi @sgodi

Can you try this:

Once you have configured the extraction of the table data, train the model and publish it to Orchestrator. Then create a new automation project, add the “Intelligent OCR” activity to it, and configure the activity to use the published model to extract the table data from the unstructured document. Finally, use a “For Each Page” loop to iterate over each page in the document and, within the loop, use the “Process Document” activity to extract the table data from that page.
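
In case it helps to see the per-page idea spelled out, here is a rough sketch in Python (not UiPath code). It assumes the extraction step hands back one list of table rows per page. The names `merge_page_tables`, `rows_to_csv` and the sample data are made up for illustration. The point is only that rows from every page get appended to one combined table, which is then written out once, rather than writing page by page.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, str]  # one table line: column name -> cell text


def merge_page_tables(pages: List[List[Row]]) -> List[Row]:
    """Append the rows of every page to one list, tagging each row with its page number."""
    merged: List[Row] = []
    for page_number, rows in enumerate(pages, start=1):
        for row in rows:
            # Skip rows whose cells are all empty (assumed to be extraction noise).
            if not any(value.strip() for value in row.values()):
                continue
            merged.append({"page": str(page_number), **row})
    return merged


def rows_to_csv(rows: List[Row]) -> str:
    """Serialize the merged rows as CSV text, which Excel can open."""
    if not rows:
        return ""
    # Collect column names in first-seen order so pages with extra columns still fit.
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical per-page results: page 2 has no table, page 3 has one empty row.
sample_pages = [
    [{"item": "Widget", "qty": "2"}],
    [],
    [{"item": "Gadget", "qty": "5"}, {"item": "", "qty": ""}],
]
merged_rows = merge_page_tables(sample_pages)
csv_text = rows_to_csv(merged_rows)
```

In a workflow, the equivalent would be adding each page's extracted rows to a single DataTable inside the loop and writing that table to Excel once after the loop finishes.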

Thanks!!