Unable to read a few rows - Document Understanding

KarthikBallary · February 23, 2021, 7:38am

Hi,
I need to read PDF files, each of which contains a table (fixed columns, variable number of rows), but some rows are not extracted correctly, even for a single file. I am using the Form Based Extractor. Kindly help.

Thanks in advance

AndresTarazona · February 23, 2021, 2:07pm

Hi @KarthikBallary

Is the page classified correctly before you use the form extractor?

AndyMenon · February 24, 2021, 2:09am

Switch to an alternate OCR engine during Digitization and set the scale to 2. Then redefine your template table columns.

Hope this helps.

KarthikBallary · February 24, 2021, 6:21am

Yes… I will attach a screenshot later.

KarthikBallary · February 24, 2021, 6:22am

I tried other OCR engines but did not change the scale. Let me try this.

KarthikBallary · February 26, 2021, 6:38am

Hi, please find the attached screenshot. Let me know if I am wrong.

OCR engines tried: Tesseract, Microsoft, OmniPage

AndresTarazona · February 26, 2021, 12:38pm

Hi @KarthikBallary

It seems you have attached screenshots for the keyword based classifier. Did you also configure the extractor?

KarthikBallary · February 26, 2021, 12:48pm

Yes, the attached screenshot shows the keyword based classifier.
