I have an issue here. This is the Validation Station, and I am trying to extract the CIN number, but the value shown in the Validation Station differs from the one in the PDF.
What kind of PDF is this, and why is it showing hidden extra text?
Hi @ankur.kaushik ,
Have you tried extracting the data from the PDF with Regex?
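To make the Regex suggestion concrete, here is a minimal sketch in Python, for illustration only. It assumes "CIN" means the 21-character Indian Corporate Identification Number (L or U, 5 digits, 2 letters, 4 digits, 3 letters, 6 digits); the thread does not confirm that layout, so adjust the pattern to your document. The function name `find_cin_candidates` and the sample value are made up for the example.

```python
import re
from typing import List

# Assumed CIN layout (not confirmed in this thread): 21 characters,
# L/U + 5 digits + 2 letters + 4 digits + 3 letters + 6 digits.
CIN_PATTERN = re.compile(r"\b[LU]\d{5}[A-Z]{2}\d{4}[A-Z]{3}\d{6}\b")


def find_cin_candidates(text: str) -> List[str]:
    """Return every CIN-shaped token in the text, in order of appearance."""
    # Upper-case first so lower-case OCR output still matches.
    return CIN_PATTERN.findall(text.upper())


# Made-up sample text standing in for the OCR output of one page.
sample = "CIN : U12345MH2010PTC123456 Regd. Office ..."
print(find_cin_candidates(sample))
```

If this returns more than one candidate for a document, that would point to the value being matched somewhere else on the page, rather than to hidden text.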
@ankur.kaushik I guess it's not extracting hidden data. Possible reasons:
- It is matching other or similar data present on the same page or on a different page.
- Because of the PDF's low quality, it is reading the data in an incorrect format.
Question: Which extractor are you using?
- If you are using the Intelligent Form Extractor or the Form Extractor, why not try Anchors?
- If you are using the ML Extractor, more retraining is required to extract the data properly.
It's a scanned image PDF.
Form Extractor.
Okay, I will try it.
Try using the Intelligent Form Extractor or the Machine Learning Extractor and check the results.
Thanks.