How to accurately divide table rows in Document Understanding

I have a PDF table that I’m trying to extract using a taxonomy.


I managed to get the columns right by adding lines between them, but it seems that it can’t recognize the grey dotted lines between the rows, so the output gets messed up.

I’ve tried adding the rows this time like this.


But it seems the row heights are not the same in each PDF (the first row is bigger in this one), so it can’t properly read or separate the rows. It either merges two or more rows, or some rows go missing. Is there any way to make it see the lines better?

Hi @Shinjid

Are you using the Form Extractor for this? It should be able to extract the results, though.
I’d recommend going the ML model route: train an ML model on some documents and then use that ML Skill to extract the data.

Thanks :smiley:

Yes, I am using the Form Extractor for this. Sorry for asking, but is the ML model and training the same as in this video tutorial online?

I’m trying to follow this one and I managed to get the columns right. My only problem is that it reads all the rows as one row (like a pivot).

Hi @Shinjid,

This video is a tutorial on the Form Extractor, which is different from creating an ML Skill.

Do you have any guide I can follow?

Hi @Shinjid

You can follow this.

Does anyone know another way around this? Sad to say, we don’t have a license for AI Fabric yet. I’ve tried the Machine Learning Extractor, but none of the available endpoints fit.
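
One possible workaround that does not depend on the ML extractor, offered only as a rough sketch: if the OCR output gives a bounding box for each word, rows could be grouped by the vertical gaps between words instead of by fixed row heights or the dotted lines. The `Word` class, the `group_rows` function, and the `min_gap` threshold below are hypothetical names for illustration, not part of UiPath or any library, and the threshold would need tuning per document.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape for one OCR word; coordinates grow downward.
    text: str
    x: float       # left edge
    top: float     # top edge
    bottom: float  # bottom edge


def group_rows(words, min_gap=2.0):
    """Group words into rows by vertical gaps.

    A new row starts when a word's top edge lies more than min_gap
    below the lowest bottom edge seen so far in the current row.
    """
    rows = []
    row_bottom = None
    for w in sorted(words, key=lambda w: w.top):
        if row_bottom is None or w.top - row_bottom > min_gap:
            rows.append([w])
            row_bottom = w.bottom
        else:
            rows[-1].append(w)
            row_bottom = max(row_bottom, w.bottom)
    # Within each row, order the words left to right.
    return [[w.text for w in sorted(r, key=lambda w: w.x)] for r in rows]


# Made-up example: a taller first row followed by a shorter one.
sample = [
    Word("A", 0, 0, 20),
    Word("B", 50, 5, 15),
    Word("C", 0, 25, 35),
    Word("D", 50, 26, 34),
]
rows = group_rows(sample)
```

Because the grouping relies on gaps rather than a fixed height, a bigger first row would not by itself cause rows to merge, though cells that wrap onto several text lines with large spacing could still be split incorrectly.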