Adjacent table extraction from PDF

I have two tables next to each other on the same page of a PDF, one on the left and one on the right.
How can I extract them? A single DTO would be ideal, but two DTOs would also work. When I tried Document Understanding with the Form Extractor, it extracted only the left table and could not extract the right one.
Any help would be appreciated.

Hi @SWATI_KAROT

Could you show an example of these tables, or even a sample file (if it doesn’t contain sensitive data)?

Hello,
I need to extract the text highlighted in yellow. The data comes from two tables but should be extracted as a single row.

I suppose you cannot simply extract it as text and then process the output string?
Maybe (hopefully) both tables have different selectors? :crossed_fingers:
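
To illustrate the "extract as text, then process the string" idea, here is a minimal Python sketch. It is not a UiPath activity and not taken from this thread. It assumes the page text was extracted with its layout preserved, so that on each line the left and right tables are separated by a wide run of spaces. The names `split_adjacent_tables`, `merge_rows`, and the `GAP` threshold are hypothetical and would need tuning for a real file.

```python
import re

# Assumption: cells inside one table are separated by 2+ spaces, and the two
# tables are separated by a run of at least GAP spaces. Tune GAP per document.
GAP = 6


def split_adjacent_tables(text, gap=GAP):
    """Return a list of (left_cells, right_cells) tuples, one per non-empty line."""
    divider = re.compile(r" {%d,}" % gap)
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Only trailing whitespace is removed, so a row whose left table is
        # empty still keeps its right-hand cells on the right side.
        parts = divider.split(line.rstrip(), maxsplit=1)
        left = parts[0].strip()
        right = parts[1].strip() if len(parts) > 1 else ""
        left_cells = re.split(r" {2,}", left) if left else []
        right_cells = re.split(r" {2,}", right) if right else []
        rows.append((left_cells, right_cells))
    return rows


def merge_rows(rows):
    """Join each left row with the matching right row into one single row."""
    return [left + right for left, right in rows]


# Made-up sample text, standing in for the extracted PDF page.
sample = (
    "Item    Qty          Code   Price\n"
    "Bolt    10           B-1    0.25\n"
    "Nut     25           N-7    0.10\n"
)

for row in merge_rows(split_adjacent_tables(sample)):
    print(row)
```

This only works when the gap between the tables is clearly wider than any gap between cells. If the spacing is irregular, or the page is indented by more than `GAP` spaces, splitting at a fixed character position or cropping each table region separately is likely more reliable.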

Hello Swati,
In this video, I cover 17 use cases for extracting tables from PDFs and writing the data to Excel:

2:00 GitHub free code for all the files
2:20 Logic of general workflow
4:40 File 1 simple PDF
9:50 File 2 PDF with a column with multiple lines
20:10 File 3 PDF with multiple words in the last column
27:00 File 5 PDF with multiple words in inner columns (2 columns)
31:40 File 6 PDF with a column with multiple lines
39:10 File 8 simple PDF
42:15 File 9 PDF with multiple spaces that need to be corrected
45:50 File 10 PDF with multiple columns that have multiple lines + multiple pages
55:50 File 11 simple PDF with protection for empty cells
58:35 File 12 big PDF with an empty line, empty columns, and a partial total
1:02:25 File 13 PDF with multiple columns that have multiple words and hard to define a rule
1:10:15 File 15 PDF with multiple columns that have multiple lines
1:12:50 File 17 simple PDF: remove spaces from headers and from data
1:16:05 File 18 simple PDF
1:17:10 File 19 PDF with multiple pages and columns with multiple lines
1:22:10 File 20 PDF with multiple columns that have multiple lines
1:25:00 File 21 PDF with empty columns and subtotal

Code:

Thanks,
Cristian Negulescu