Hi all, please help with this issue in Document Understanding.

I am trying to do data labeling in AI Center. Some fields are missing in some PDF documents, so I am hiding those fields. Now those fields are no longer extracted, even from documents where the data is present. Please suggest a solution. Thank you.

@Lokesh_M2

You do not need to hide it. If the field is not present, it will simply be blank with no data.

cheers

Hi @Lokesh_M2

When labeling data, it is recommended to use documents that actually contain the fields you are labeling, because the labeled data is sent to the dataset that is then used to build the ML model.

If a field is not present in one of the documents you are labeling, simply leave that field empty for that document, but do not mark it as hidden. Hiding it excludes the field when exporting to the dataset, so you will not be able to extract it.

Leave it empty like this:

image

Hope this helps,
Best Regards.

If I leave the fields empty, I get an error when exporting.
image

@Lokesh_M2

This is not because you left the field empty. As the error itself indicates, you need to label at least 10 pages before you can export to the dataset.

Best Regards.

But in my native documents, at least one field is always missing.

@Lokesh_M2

You might have to get more data in that case. As per UiPath’s official documentation:

→ For Regular fields, you need at least 20-50 document samples per field. So, if you need to extract 10 regular fields, you need at least 200-500 document samples.

→ For Column fields, you need at least 50-200 document samples per column field, so for 5 column fields, with clean and simple layouts, you might get good results with 300 document samples.

→ Classification fields generally require at least 10-20 document samples from each class.

https://docs.uipath.com/document-understanding/automation-cloud/latest/user-guide/training-high-performing-models
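
As a rough illustration of the arithmetic above, here is a minimal Python sketch that multiplies the quoted per-field ranges by the number of fields. The names `SAMPLES_PER_FIELD` and `estimate_samples` are made up for this example and are not part of any UiPath API. Treat the result as a ballpark only: the documentation's own column example (about 300 samples for 5 clean, simple column fields) sits near the low end of the computed range.

```python
from typing import Tuple

# Per-field sample ranges as quoted above from the UiPath documentation.
# Keys and function names are hypothetical, chosen for this sketch.
SAMPLES_PER_FIELD = {
    "regular": (20, 50),
    "column": (50, 200),
    "classification": (10, 20),  # per class, not per field
}


def estimate_samples(kind: str, count: int) -> Tuple[int, int]:
    """Return a (low, high) estimate of document samples for `count` fields or classes."""
    low, high = SAMPLES_PER_FIELD[kind]
    return low * count, high * count


print(estimate_samples("regular", 10))  # (200, 500)
print(estimate_samples("column", 5))    # (250, 1000)
```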

Best Regards.

Thank you @arjunshenoy for your valuable info.

Thank you @Anil_G
