Extract the table data for multiple pages by using Document Understanding

sgodi · May 1, 2023, 6:08pm

Hello All,

Can anyone please tell me how to extract table line data that spans multiple pages of an unstructured document by using Document Understanding?
I’ve tried it on a single page with the Form Extractor and was able to write the extracted data to Excel. But when I try to extract the table data from a document of 50+ pages, nothing is extracted and nothing is written to Excel. Can anyone please help with this?

arjunshenoy · May 1, 2023, 6:21pm

Hi @sgodi

Try one of the UiPath Document Understanding out-of-the-box packages (Invoices, etc.) with the ML extractor and check whether it can extract the data. If it can't, you might have to label sample documents in order to train the ML package, which can then be used as a custom skill.

Hope this helps,
Best Regards.

sgodi · May 1, 2023, 7:36pm

Thanks for the response, @arjunshenoy. So in this case, form-based or intelligent form-based extractors will not work for this scenario?

arjunshenoy · May 2, 2023, 2:09am

@sgodi

IFE falls under the form-based extraction category. That approach is best suited for use cases where documents with a fixed format need to be processed and have data extracted from them. In other words, if your documents have little to no variation in layout, or if you want to mark certain fields as handwritten (a signature, in some cases), then it is a good choice.

Since you are dealing with an unstructured document of over 50 pages, I recommend going with ML extraction.

Hope this helps,
Best Regards.

Nitya1 · May 2, 2023, 5:50am

Hi @sgodi

Can you try this:

Once you have configured the extraction of the table data, train the model and publish it to Orchestrator. Then create a new automation project, add the “Intelligent OCR” activity to it, and configure the activity to use the published model to extract the table data from the unstructured document. Finally, use a “For Each Page” loop to iterate over each page in the document and, within the loop, use the “Process Document” activity to extract the table data from each page.
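
To illustrate the per-page loop, here is a rough Python sketch of the merge step only: combining the rows extracted from each page into one table before writing it out. It does not use any UiPath API. The names `merge_page_tables` and `to_csv_text` and the sample data are made up for illustration, and it assumes each page's table arrives as a list of rows, where a page may or may not repeat the header row.

```python
import csv
import io


def merge_page_tables(pages):
    """Combine per-page table rows into one table (hypothetical helper).

    `pages` is a list with one entry per page; each entry is a list of rows.
    The first non-empty page is assumed to start with the header row.
    """
    merged = []
    header = None
    for page_number, rows in enumerate(pages, start=1):
        if not rows:
            continue  # no table found on this page
        if header is None:
            header = rows[0]
            merged.append(["page"] + header)
            body = rows[1:]
        elif rows[0] == header:
            body = rows[1:]  # header repeated on a continuation page
        else:
            body = rows  # continuation page without a header
        for row in body:
            merged.append([page_number] + row)
    return merged


def to_csv_text(rows):
    """Render the merged rows as CSV text, which Excel can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample: four pages, one empty, one without a repeated header.
pages = [
    [["Item", "Qty"], ["Bolt", "10"], ["Nut", "12"]],
    [],
    [["Item", "Qty"], ["Washer", "7"]],
    [["Screw", "3"]],
]
merged = merge_page_tables(pages)
print(to_csv_text(merged))
```

Keeping the page number in each row makes it easier to check which pages returned nothing, which seems to be the symptom with the 50+ page document.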

Thanks!!
