Cannot extract data from Italian ID Card

gromeo · September 2, 2021, 2:38pm

Hi everybody,
I’m trying to use the form extractor to extract a few fields from an old Italian ID card (the paper one, not the plastic one). The scanned document is a two-page, grayscale, 300 dpi PDF. I’m using this file (a non-redacted version, of course): ID-card-redacted.pdf (1.5 MB), both for the template creation and as the document to extract the data from, so in theory the template matches the document 100%, since it’s the same file.
I’ve used anchors to define the two fields I want to extract. Judging by the tutorials and docs, it should be easy.
When running, it recognizes the two pages correctly, but it does not extract the “Nome” field and extracts a wrong value for “Cognome”. I don’t understand why. I’ve tried changing the anchors and also adding more than one anchor for each field: in the latter case, even the Cognome is not extracted…
Here are two screenshots from the template manager:

Using those two anchors, here is the result from the validation station (the extracted Cognome field value is surreal…):

Any ideas?
Thanks.

NIVED_NAMBIAR · September 2, 2021, 4:41pm

Hi @gromeo
Just to check one thing: what was the confidence level for the extraction of the Name field?

gromeo · September 2, 2021, 4:56pm

The Name field (Nome) was not extracted. The Surname field (Cognome) was extracted with a 69% confidence level, but it was horribly wrong, as shown in the image above.
Giovanni

Parth_Doshi · September 2, 2021, 6:33pm

Hey @gromeo

I would suggest using the Intelligent Form Extractor, which should give you better and more accurate results.
Try the Machine Learning Extractor, which already has prebuilt models for ID cards.
Documentation link: Public Endpoints

The two solutions above should help you solve your problem. Let me know if you run into any issues.

gromeo · September 3, 2021, 7:26am

Hi,

Thanks for your response. I’ll try those solutions, but I don’t understand why the standard form extractor is behaving this way: I’m using the same file for the template and for the document to process, and it doesn’t work… That’s not expected. I’d like to understand whether there is a bug in the form extractor or I’m doing something wrong.

Thanks,
Giovanni Romeo

gromeo · September 3, 2021, 8:45am

I replaced the form extractor with the intelligent form extractor, using the very same template (exported from the form extractor template manager and imported into the intelligent form extractor template manager), and the results are exactly the same: no name extracted, and a horribly wrong surname extracted with 69% confidence. I think there’s something wrong with the template.

gromeo · September 3, 2021, 11:44am

I also tried creating a template without anchors, just plain custom areas (I think that’s what the areas defining the values to extract are called), but in this case no values are extracted at all (both name and surname are missing). I also tried a different PDF file (both for the template and the input document) with the front and back of the ID card on the same (first) page, and I still get no extracted data.
Honestly, I don’t know what to try next. The OCR process works correctly (almost all of the text is recognized)…

gromeo · September 8, 2021, 8:23am

Anyone else? I’m stuck…
Thanks,
Giovanni Romeo
