Regex Based Extractor - Table

Hello,

How can I extract the table using the Regex Based Extractor? I tried the below, but nothing is returned.

Invoices_eng_OCR_Doc1.pdf (90.4 KB)


Thank you

Did you get a reply to this question?

No one has replied to this…

Try resubmitting, and I will say I have the same problem, and we can see if we can get someone from UiPath to respond.

I am interested in exactly the same issue. The Forms based extractor does not appear to work on anything but the simplest tables, so you need to use Regex, but I cannot get this to work either. The simplest solution from Support would be a short video tutorial on entering the table part of the Regex extractor. At the moment I am solving this problem by dropping out of UiPath, reading the PDF into Adobe, and exporting it as either an HTML or Excel file. Both work fine. However, it's a fix rather than an elegant solution.

Yes, I can extract the table by reading the full text from the PDF, but it fails in DU. We would like to use the DU validation as well, so I hope they can fix the issue.
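For anyone trying the full-text route outside DU, here is a minimal sketch of pulling table rows out of plain text with a regex. It is written in Python rather than as a UiPath workflow, and the sample text, the column layout (description, quantity, unit price, total) and the names SAMPLE_TEXT, ROW_PATTERN and extract_rows are all assumptions for illustration, since the actual invoice layout is not shown in this thread.

```python
import re

# Hypothetical sample of text as it might come out of a PDF text read.
# The real invoice layout from this thread is not known.
SAMPLE_TEXT = """Invoice 1001
Description Qty Unit Price Total
Blue widget 2 10.00 20.00
Large steel bracket 10 1.50 15.00
Subtotal 35.00
"""

# Assumed row shape: free-text description, integer quantity,
# then two amounts with exactly two decimals, separated by spaces or tabs.
ROW_PATTERN = re.compile(
    r"^(?P<description>.+?)[ \t]+(?P<qty>\d+)[ \t]+"
    r"(?P<unit_price>\d+\.\d{2})[ \t]+(?P<total>\d+\.\d{2})$",
    re.MULTILINE,
)

def extract_rows(text):
    """Return one dict per line that matches the assumed row shape."""
    return [match.groupdict() for match in ROW_PATTERN.finditer(text)]

rows = extract_rows(SAMPLE_TEXT)
for row in rows:
    print(row)
```

The header and subtotal lines are skipped only because they do not fit the assumed row shape; a different layout would need a different pattern.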

Hello Leowong,
In this video, I cover 17 use cases for extracting tables from PDFs and writing the data to Excel:

2:00 GitHub free code for all the files
2:20 Logic of general workflow
4:40 File 1 simple PDF
9:50 File 2 PDF with a column with multiple lines
20:10 File 3 PDF with a column with multiple words in the last column
27:00 File 5 PDF with a column with multiple words in inner columns (2 columns)
31:40 File 6 PDF with a column with multiple lines
39:10 File 8 simple PDF
42:15 File 9 PDF with multiple spaces that need to be corrected
45:50 File 10 PDF with multiple columns that have multiple lines + multiple pages
55:50 File 11 simple PDF with protection against empty cells
58:35 File 12 Big PDF with an empty line and Empty columns and partial total
1:02:25 File 13 PDF with multiple columns that have multiple words and hard to define a rule
1:10:15 File 15 PDF with multiple columns that have multiple lines
1:12:50 File 17 simple PDF, removing spaces from headers and from data
1:16:05 File 18 simple PDF
1:17:10 File 19 PDF with multiple pages and columns with multiple lines
1:22:10 File 20 PDF with multiple columns that have multiple lines
1:25:00 File 21 PDF with empty columns and subtotal
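Several of the cases above involve a column whose text wraps onto multiple lines. As a rough illustration only (this is not the method used in the video, and it is Python rather than a UiPath workflow), one way to handle that is to treat any line that does not start a new row as a continuation of the previous row. The row shape (item number, description, amount) and the names ROW_START and merge_wrapped_rows are assumptions.

```python
import re

# Assumed row shape: item number, free-text description, amount with two decimals.
ROW_START = re.compile(
    r"^(?P<item>\d+)[ \t]+(?P<description>.+?)[ \t]+(?P<amount>\d+\.\d{2})$"
)

def merge_wrapped_rows(lines):
    """Attach lines that do not start a new row to the previous row's description."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        match = ROW_START.match(line)
        if match:
            rows.append(match.groupdict())
        elif rows:
            # Continuation of a wrapped description. Note that footer lines
            # such as totals would also land here and need filtering first.
            rows[-1]["description"] += " " + line
    return rows

sample_lines = [
    "1 Consulting services 500.00",
    "for March",
    "2 Travel 120.50",
]
merged = merge_wrapped_rows(sample_lines)
for row in merged:
    print(row)
```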

Code:

Thanks,
Cristian Negulescu
