How to accurately divide table rows in Document Understanding

Shinjid · June 15, 2023, 6:22am

I have a PDF table that I’m trying to extract through a taxonomy.

I managed to get the columns right by adding lines between them, but it seems it can’t recognize the grey dotted lines between the rows, so the output gets messed up.

This time I’ve tried adding the row lines as well.

But the row heights are not the same in each PDF (the first row is taller in this one), so it can’t properly read or separate the rows. It either merges two or more rows, or some rows go missing. Is there any way to make it see the lines better?

adiijaiin · June 15, 2023, 6:49am

Hi @Shinjid

Are you using the Form Extractor for this? It should be able to extract the results, though.
I’d recommend going the ML model route instead: train an ML model on some documents, then use that ML Skill to extract the data.

Thanks

Shinjid · June 15, 2023, 7:24am

Yes, I am using the Form Extractor for this. Sorry for asking, but is the ML model and training the same as in this video tutorial online?

I’m trying to follow this one, and I managed to get the columns right. My only problem is that it reads all the rows as one row (like a pivot).

adiijaiin · June 15, 2023, 7:41am

Hi @Shinjid,

This video is a tutorial on the Form Extractor, which is different from ML Skill creation.

Shinjid · June 15, 2023, 7:52am

Do you have any guide I can follow?

adiijaiin · June 15, 2023, 7:53am

Hi @Shinjid

You can follow this.

Shinjid · June 16, 2023, 8:11am

Does anyone know another way around this? Sadly, we don’t have a license for AI Fabric yet. I’ve tried the Machine Learning Extractor, but none of the available endpoints fit.
