I have a requirement to extract particular fields from multiple pages of a PDF file. I believe this counts as unstructured input, since the PDF is scanned.
I tried the following:
- Read PDF with OCR
  - The fields are difficult to extract because the PDF is laid out in columns, so each label is separated from its value by several lines of unrelated data.
- Get Text with OCR (using Anchor)
  - I can extract fields from the first page of the PDF (though not always, since the font size varies between scanned PDFs), but not from the other pages, because they are not visible on screen.
- Purchasing an ABBYY license is out of scope.
I would appreciate any suggestions or guidelines.
Thanks in advance!