How can I extract specific data from a PDF file whose content is plain text rather than a table?
It mostly comes down to labels or patterns in the text: you find a value by the label next to it or by the shape of the value itself.
Can you share a sample?
I don’t have a sample handy, but if you have any PDF samples, maybe I can help you with a small POC.
One of the most common tools is “Regex”.
Hope it helps!
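To make the label/pattern idea concrete, here is a minimal Python sketch. It assumes the text has already been pulled out of the PDF; the sample text, the field names, and the `extract_fields` helper are all made up for illustration and would need adapting to your document.

```python
import re

# Hypothetical text, standing in for whatever your PDF text-extraction step returns.
text = """Invoice Number: INV-2041
Invoice Date: 12/03/2023
Total Amount: 1,250.00 USD
"""

# One pattern per field: the label, optional whitespace, then a capture group for the value.
PATTERNS = {
    "invoice_number": r"Invoice Number:\s*(\S+)",
    "invoice_date": r"Invoice Date:\s*(\d{2}/\d{2}/\d{4})",
    "total": r"Total Amount:\s*([\d,]+\.\d{2})",
}


def extract_fields(text, patterns):
    """Return {field: captured value, or None if the pattern did not match}."""
    result = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        result[name] = match.group(1) if match else None
    return result


fields = extract_fields(text, PATTERNS)
print(fields)
```

Returning None for a missing field, rather than raising, makes it easy to spot which labels were not found in a given document.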
Try this example …
Put the PDF file into the “PDF PATH” folder to try this example.
In this example I used OCR to extract all the plain text from the PDF, and then regex to pull the specific data out of the OCR output.
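For anyone who wants to try the same OCR-then-regex idea in code, below is a rough Python sketch. It is not the attached example: it assumes the third-party packages pdf2image (which needs Poppler) and pytesseract (which needs Tesseract) are installed, and the names `ocr_pdf`, `normalize`, and `extract_from_pdfs` are hypothetical.

```python
import re


def ocr_pdf(pdf_path):
    """OCR every page of a PDF and return the combined text.

    Assumption: pdf2image and pytesseract are installed. They are imported
    here so the regex helpers below work even without them.
    """
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(str(pdf_path))  # one PIL image per page
    return "\n".join(pytesseract.image_to_string(page) for page in pages)


def normalize(text):
    """Collapse runs of spaces and tabs, which OCR output often contains."""
    return re.sub(r"[ \t]+", " ", text)


def extract_from_pdfs(pdf_paths, pattern, ocr=ocr_pdf):
    """Map each path to every capture of `pattern` found in its OCR text."""
    results = {}
    for path in pdf_paths:
        text = normalize(ocr(path))
        results[str(path)] = re.findall(pattern, text)
    return results
```

To run it over the folder, you could pass something like `Path("PDF PATH").glob("*.pdf")` (from `pathlib`) as `pdf_paths`, together with a pattern that has one capture group for the value you want. The `ocr` parameter is there so you can swap in a different OCR engine.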