Hello, for some time I have been watching and reading about PDF extraction, but I can't seem to find anything that helps in my particular case.
Most of the examples you give involve invoices and other forms for which templates are built, using regex, the form extractor, OCR and so on.
I have a few PDF files which look similar at first glance, but the data I need to extract is not in the same place in every file (the files are 2 to 4 pages long).
My question: is there any way, or any activity, that takes the dotted lines into account? I need to extract the “Denumire”, “Rezultat” and “UM” columns.
With Document Understanding I managed to create a template for the first file, but the second one needed a different template. Unfortunately, making a template for each file is not worth the time (too many possibilities).
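
To make the question more concrete, here is a rough sketch of the kind of position-independent parsing I have in mind. It is plain Python, not a UiPath activity, and it rests on assumptions that may not hold for the real files: that the PDF text has already been extracted as lines, and that each row looks like a name, a run of at least three dots, then the result and an optional unit. The names `ROW`, `parse_rows` and the sample rows are made up for illustration.

```python
import re

# Assumed row shape (hypothetical): "<Denumire> ..... <Rezultat> <UM>"
# The run of dots is used as the separator, so the row's position on the page does not matter.
ROW = re.compile(
    r"^\s*(?P<denumire>.+?)\s*\.{3,}\s*(?P<rezultat>\S+)(?:\s+(?P<um>\S.*?))?\s*$"
)

def parse_rows(text):
    """Return (Denumire, Rezultat, UM) tuples for every line that contains a dotted leader."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            # UM is optional; use an empty string when the row has no unit.
            rows.append((m.group("denumire"), m.group("rezultat"), m.group("um") or ""))
    return rows

# Made-up sample text, only to show the idea.
sample = """Some header text
Item A ........ 13.5 g/dL
Item B ...... 92 mg/dL
Item C .......... 6
Some footer text"""

for row in parse_rows(sample):
    print(row)
```

Lines without a dotted leader (headers, footers) are simply skipped, which is why no per-file template would be needed if the real rows follow this shape.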