Document Understanding in UiPath Studio

Yasir_Yaqoob · March 7, 2025, 5:18am

Hey, I have a question related to Document Understanding. I am extracting a table from a PDF, and the number of rows may vary. I get extra rows in the results. For example, the table has 2 rows, but it returns 7 rows. The other 5 come from different sections of the document. I re-indicated the table, but the result is the same on 3 different documents. Can someone shed light on it?

Plus, which OCR is better for DU? I am using UiPath Document OCR. I was thinking about using OmniPage OCR. Thanks a lot.

Darshan_Sable · March 7, 2025, 6:55am

@Yasir_Yaqoob

You can use the Document Understanding tab to create an ML skill by training on multiple documents (at least 10).
Once you train your ML skill, it will capture only the desired rows. I am assuming all the PDFs have the same structure; the number of rows can vary.

Create the table fields and click the Predict button; it will predict the values for you.
If something is indicated wrongly, you can update the fields.

Anil_G · March 7, 2025, 7:12am

@Yasir_Yaqoob

Does it have the same number of columns and the same column names?
Train on more samples to get accurate data.
While annotating, make sure you indicate only the pages you need data from.

cheers
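
Until a trained model is in place, one stopgap for the extra rows is to filter the extracted table after the fact. The following is only a minimal sketch of that idea in Python, not UiPath API: the row layout (a list of dicts with a `page` key) and the names `filter_table_rows`, `allowed_pages`, and `required_columns` are all hypothetical, and the same logic would have to be rebuilt against whatever table object the workflow actually returns.

```python
from typing import Any

Row = dict[str, Any]


def filter_table_rows(
    rows: list[Row],
    allowed_pages: set[int],
    required_columns: list[str],
) -> list[Row]:
    """Keep rows that sit on an expected page and have every required cell filled.

    The "page" key and the column names are assumptions for illustration only.
    """
    kept: list[Row] = []
    for row in rows:
        # Drop rows that were picked up from other sections (pages) of the document.
        if row.get("page") not in allowed_pages:
            continue
        # Drop rows where any required cell is missing or blank.
        incomplete = False
        for column in required_columns:
            value = row.get(column)
            if value is None or not str(value).strip():
                incomplete = True
                break
        if incomplete:
            continue
        kept.append(row)
    return kept


# Made-up sample data: two real table rows plus two stray rows.
extracted = [
    {"page": 1, "description": "Widget A", "amount": "10.00"},
    {"page": 1, "description": "Widget B", "amount": "5.50"},
    {"page": 2, "description": "Terms and conditions", "amount": ""},
    {"page": 3, "description": "", "amount": "15.50"},
]

clean = filter_table_rows(
    extracted,
    allowed_pages={1},
    required_columns=["description", "amount"],
)
print(len(clean))
```

This only hides the symptom; training on more samples, as suggested above, addresses the extraction itself.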

Yasir_Yaqoob · March 7, 2025, 7:17am

You are speaking about training first. I am not using a trained model, but I would like to. Could you let me know how I can use that trained model in UiPath Studio?

Darshan_Sable · March 7, 2025, 7:28am

@Yasir_Yaqoob Sure

Click on create project.

Create a document type, for example invoices.

Click open document type.

Import at least 10 files and indicate the fields.

Yasir_Yaqoob · March 7, 2025, 8:00am

I knew that part. Next, if I want to build an automation in UiPath Studio, how do I select that trained project?

Darshan_Sable · March 7, 2025, 8:07am

@Yasir_Yaqoob you have to create an ML skill for extraction with the automated training option.

Once training is completed, use this ML skill in your project with the ML extractor.

Yasir_Yaqoob · March 7, 2025, 8:33am

OK, thank you. I will give it a try very soon and will mark your answer as the solution. I appreciate your time.

