Labelling in Document Understanding


I want to extract the fields Elements, Minimum, Maximum and Actual.
While labelling, all the data is extracted into the same column. How can I extract each item as a separate row?

@AbarnaKalaiselvam

If it's a table, then you need to indicate the separate columns as well.

Do you need Elements as a separate column, and so on? If so, it might not be extracting like that.

Cheers

Hi Abarna,
Please check the following link about labeling Documents in Document Manager.
Link: Document Understanding - Label Documents

Cheers!

@Anil_G


I need to extract the items under each column into different rows (one item per row), for Elements and likewise for the other columns.
But they are grouped into a single cell.

@AbarnaKalaiselvam

That's because you defined Elements as a column.

You need to define the actual columns. Then you would get each in a different column and each element in one row, etc.

Cheers

Hi, follow the steps below:

To extract data as separate rows from a structured document like the one you’re describing, you can use UiPath Document Understanding. Here’s how to approach the labeling process:

Step 1: Define Your Extraction Schema

  • Open Taxonomy Manager (or Document Manager, if you are labelling data to train an ML model).
  • Define your data schema to include the fields you want to extract: Elements, Minimum, Maximum, and Actual.
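
As a rough illustration of what such a schema amounts to, the sketch below models one table field with four columns in plain Python. It is not UiPath's taxonomy format. The class name `TableField` and the field name `line_items` are hypothetical and only show that Elements, Minimum, Maximum and Actual are columns of a single table rather than four independent fields.

```python
from dataclasses import dataclass, field


@dataclass
class TableField:
    """Hypothetical stand-in for a table field in an extraction schema."""
    name: str
    columns: list[str] = field(default_factory=list)


# One table field; each labelled row then carries one value per column.
line_items = TableField(
    name="line_items",  # hypothetical field name, not taken from the document
    columns=["Elements", "Minimum", "Maximum", "Actual"],
)

print(line_items.columns)
```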

Step 2: Digitize the Document

  • Use the ‘Digitize Document’ activity to convert the PDF into a machine-readable format.

Step 3: Present Validation Station

  • Use the ‘Present Validation Station’ activity to manually validate and correct the extracted information.
  • During validation, ensure that each value is labeled correctly according to the schema you’ve defined.

Step 4: Labeling for Separate Rows

  • When labeling, if the values are extracted in the same column but need to be separate rows, you should label each value individually.
  • Depending on how the Document Understanding framework processes your document, you might need to adjust the data definition for each field to specify that the values should be captured as separate items.

Step 5: Train Your Model

  • After labeling, train your model with the labeled data. This helps the ML model to learn the correct structure and improve extraction accuracy.

Step 6: Apply Trained Model

  • Once trained, apply the trained ML model to extract data from similar documents using the ‘Machine Learning Extractor’ activity.

Step 7: Data Extraction for Separate Rows

  • Use the ‘Data Extraction Scope’ activity with the trained ML model to extract the data from new documents.
  • The ‘ExtractedDataTable’ variable will hold the extracted data where each row corresponds to one set of Elements, Minimum, Maximum, and Actual.
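
To make the target shape concrete, here is a minimal Python sketch of "one row per item". It assumes the extraction returned each column as a single cell with newline-separated values, which may not match your actual output. The function name `split_grouped_cells` and the sample values are made up for illustration. It is a post-processing workaround outside UiPath, not part of the Document Understanding API.

```python
from itertools import zip_longest

COLUMNS = ["Elements", "Minimum", "Maximum", "Actual"]


def split_grouped_cells(grouped: dict[str, str]) -> list[dict[str, str]]:
    """Turn one multi-line cell per column into one dict per row."""
    # Split each column's cell into its non-empty lines.
    per_column = [
        [v.strip() for v in grouped.get(col, "").splitlines() if v.strip()]
        for col in COLUMNS
    ]
    # Pair up the i-th value of every column; pad short columns with "".
    return [
        dict(zip(COLUMNS, values))
        for values in zip_longest(*per_column, fillvalue="")
    ]


# Made-up sample values, only to show the shape of the data.
grouped = {
    "Elements": "C\nSi\nMn",
    "Minimum": "0.10\n0.15\n0.60",
    "Maximum": "0.20\n0.35\n0.90",
    "Actual": "0.15\n0.22\n0.75",
}

rows = split_grouped_cells(grouped)
for row in rows:
    print(row)
```

Note that this pairs values purely by position, so a value missing in the middle of a column would shift the rows below it. Labelling the table with proper row boundaries remains the more reliable fix.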

Step 8: Review and Adjust

  • After extraction, review the results. If the data is not being extracted as separate rows, you may need to go back and adjust the labeling or the schema definition.

@Anil_G ,

Yes, I’m defining Elements as a column, because that is my header.
The items under Elements change from document to document, so I can’t define them as column names.
Is there any solution for my scenario?