Extract datatable from PDF and write to Excel

Good morning, I tried to extract a datatable from a PDF and write it to Excel. I tried several solutions from the forum, but they did not work. Most of the time I got one cell with all the information. I also tried the PDFtoExcel package, but that did not work either. Who can help me with this problem? Thanks, Michel

Hi @mrusman,
It’s not an easy task, as PDFs have a different file structure in which a table is defined completely differently than in Excel. Additionally, a PDF can hold data not only as text but also embedded as an image. So you need to check which applies in your case and then work with the “hidden” elements. If the PDF presents the data as a table, there is almost certainly some space or delimiter between each “cell”. Use it to split the data, and define in your process that each piece of a line goes to a separate Excel cell. Sometimes you need to work through the data line by line, or even cell by cell, instead of grabbing all the data at once and pushing it to the target document.
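
A minimal sketch of the splitting idea, written in Python only for illustration. It assumes the PDF text has already been extracted as plain text and that columns are separated by two or more spaces. The sample text, the function names and the delimiter pattern are all hypothetical, so adjust them to your own file. It writes CSV, which Excel can open, to stay within the standard library.

```python
import csv
import io
import re

# Hypothetical sample standing in for text already extracted from a PDF page.
SAMPLE_TEXT = """\
Item        Qty    Price
Blue pen    10     1.50
Notebook    3      4.25
"""


def split_table_lines(text, delimiter=r"\s{2,}"):
    """Split each non-empty line into cells on runs of two or more spaces.

    A single space is kept inside a cell, so "Blue pen" stays one value.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            rows.append(re.split(delimiter, line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text, one table row per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


rows = split_table_lines(SAMPLE_TEXT)
csv_text = rows_to_csv(rows)
print(csv_text)
```

If everything still lands in one cell, the delimiter assumption is probably wrong for that file (for example, columns separated by a single space or a tab), and the pattern needs to change or the lines need to be handled one by one.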

Hello Michel,
In this video, I cover 17 use cases for extracting tables from PDFs and writing the data to Excel:

2:00 GitHub free code for all the files
2:20 Logic of general workflow
4:40 File 1 simple PDF
9:50 File 2 PDF with a column with multiple lines
20:10 File 3 PDF with multiple words in the last column
27:00 File 5 PDF with multiple words in inside columns (2 columns)
31:40 File 6 PDF with a column with multiple lines
39:10 File 8 simple PDF
42:15 File 9 PDF with multiple spaces that need to be corrected
45:50 File 10 PDF with multiple columns that have multiple lines + multiple pages
55:50 File 11 simple PDF with protection empty Cells
58:35 File 12 Big PDF with an empty line and Empty columns and partial total
1:02:25 File 13 PDF with multiple columns that have multiple words and hard to define a rule
1:10:15 File 15 PDF with multiple columns that have multiple lines
1:12:50 File 17 simple PDF, remove spaces from headers and from data
1:16:05 File 18 simple PDF
1:17:10 File 19 PDF with multiple pages and columns with multiple lines
1:22:10 File 20 PDF with multiple columns that have multiple lines
1:25:00 File 21 PDF with empty columns and subtotal

Code:

Thanks,
Cristian Negulescu