Extract data from multiple PDFs, need guidance

Droopy · December 22, 2021, 10:47am

Hello, for some time I have been watching and reading about PDF extraction, but I can't seem to find anything that helps in my particular case.

Most of the examples you give involve invoices and other forms for which templates are built, using regex, the form extractor, OCR and so on.

I have a few PDF files which look similar at first glance, but the data I need to extract is not in the same place in all of the files (the files are 2 to 4 pages long).

1.pdf (212.6 KB)
5.pdf (138.6 KB)

My question to you: is there any way, or any activity, to take the dotted lines into consideration? I need to extract the “Denumire”, “Rezultat” and “UM” columns.

With Document Understanding I managed to create a template for the first file, but the second one needed a different template. Unfortunately, making a template for each file is not worth the time (too many possibilities).

Any thoughts?
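
To make the idea concrete, here is a rough Python sketch of the kind of position-independent extraction I am after. It is only an illustration under assumptions, not something taken from my files: it assumes the PDF text has already been read into a string with the layout roughly preserved, that the header line contains all three column names in the order Denumire, Rezultat, UM, that columns are separated by runs of spaces or dot leaders, and that lines made only of dots or dashes are row separators. The function name extract_rows is hypothetical.

```python
import re

# Column names that identify the header row (taken from the question).
HEADER = ("Denumire", "Rezultat", "UM")

# Assumption: a separator is a line made only of dots, dashes, or whitespace.
SEPARATOR = re.compile(r"^[.\-\s]*$")

# Assumption: columns are split by a dot leader (3+ dots) or by 2+ spaces.
GAP = re.compile(r"\s*\.{3,}\s*|\s{2,}")


def extract_rows(text):
    """Return (denumire, rezultat, um) tuples found after the header line."""
    rows = []
    in_table = False
    for line in text.splitlines():
        stripped = line.strip()
        # The header can sit anywhere in the text; find it by its content.
        if all(name in stripped for name in HEADER):
            in_table = True
            continue
        # Ignore everything before the header, plus separator and blank lines.
        if not in_table or SEPARATOR.match(stripped):
            continue
        cells = [cell for cell in GAP.split(stripped) if cell]
        # Keep only lines that split into at least the three wanted columns.
        if len(cells) >= 3:
            rows.append(tuple(cells[:3]))
    return rows
```

The sketch does not detect where the table ends, so any later line that happens to split into three or more cells would also be picked up; a real workflow would need a stop condition that matches the actual files.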
