Issue with table data extraction using Document Understanding

Hi Team,

I am reading a PDF that contains tabular data. I am able to read the data on page one,

but when I try to extract the tabular data on page 2, the extracted data is not properly aligned.

Could you please suggest whether this PDF can be automated, or whether it is not a good fit for RPA?

PDF work.docx (401.1 KB)

Hi @JSR_Techno_Talk_s,

You can try data scraping for structured table extraction.

Thanks

Hi @JSR_Techno_Talk_s,

I believe you are using the out-of-the-box (OOB) UiPath Invoice ML model for data extraction? From the documents you shared, I can see that the model is not detecting the lines in the invoice correctly. You will therefore need to label data (using Data Manager) for a set of invoices and retrain the ML model (using AI Center). You should then get better results.

Hope that helps!

Regards,
Nithin

Hi Nithin,

Sorry, but I could not relate the suggested approach to my case: I am using the Form Extractor for this PDF.

I am attaching the code and the file in case you or anyone else can help me with it.

I am currently writing VB.NET code to read the data from the string (ref. ReadPDf_as_string), but UiPath has AI and ML capabilities, and if they can help me, that would be good for me.
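
For the string-parsing route, here is a minimal sketch of the idea, written in Python rather than VB.NET. It assumes the PDF has already been read into one string and that each table row ends with a quantity, a unit price, and an amount. That row layout, the sample text, and the names `ROW_PATTERN`, `to_decimal`, and `parse_line_items` are all hypothetical, since the real columns of the invoice are not shown in this thread. Matching the numeric columns from the end of each line keeps rows aligned even when the description contains spaces, and lines that do not fit (headers, page footers) are skipped.

```python
import re
from decimal import Decimal

# Hypothetical row layout: <description> <quantity> <unit price> <amount>.
# The real columns of the invoice are not known here, so adjust the pattern.
ROW_PATTERN = re.compile(
    r"^(?P<description>.+?)\s+"
    r"(?P<quantity>\d+)\s+"
    r"(?P<unit_price>\d[\d,]*\.\d{2})\s+"
    r"(?P<amount>\d[\d,]*\.\d{2})\s*$"
)


def to_decimal(text):
    """Convert a string such as '1,250.00' to a Decimal."""
    return Decimal(text.replace(",", ""))


def parse_line_items(pdf_text):
    """Return one dict per line matching ROW_PATTERN; other lines are skipped."""
    rows = []
    for line in pdf_text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            continue  # headers, page footers, blank lines
        rows.append({
            "description": match.group("description"),
            "quantity": int(match.group("quantity")),
            "unit_price": to_decimal(match.group("unit_price")),
            "amount": to_decimal(match.group("amount")),
        })
    return rows


# Made-up sample text standing in for the output of a PDF-to-text step.
SAMPLE_TEXT = """\
Description Qty Unit Price Amount
Widget A 2 10.00 20.00
Service fee, onsite 1 1,250.00 1,250.00
Page 2 of 2
Widget B   3   5.50   16.50
"""

items = parse_line_items(SAMPLE_TEXT)
for item in items:
    print(item)
```

The same pattern-per-row approach should carry over to VB.NET with `System.Text.RegularExpressions.Regex`, but it only holds up while the text layout of the invoice stays fixed.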

Maybe there is a skill gap on my side that I want to overcome.

Invoice2.pdf (571.4 KB)
PDFReading.zip (264.6 KB)

Hi @JSR_Techno_Talk_s,

Is the format of the invoice always going to be the same? UiPath recommends the Form Extractor only for fixed-form documents (e.g. survey forms, government forms, etc.).

For semi-structured documents such as invoices and purchase orders, you should use the Machine Learning Extractor.

Regards,
Nithin

Hi @JSR_Techno_Talk_s,

I also recommend that you watch these amazing DU tutorials by Lahiru Fernando.

To learn more about AI Center and the Machine Learning Extractor, you can watch this video.

Hope that helps!

Regards,
Nithin

Yes, it will always be the same.

Thanks for guiding me. Let me check the tutorial.

@Nithin_P:

I saw this article; he is using lots of ML extractors.

This invoice is for the Australia process order, so I used ML extractors,

but I am not getting all the fields from the extractors.

Could you please guide me?