How to extract tabular data from an invoice with UiPath activities

JaneDoe_01092020_130792.pdf (17.6 KB)
This is the kind of PDF table I want to extract. I have 4-7 PDFs, and I don’t want to use Document Understanding.
Regards,

Hi @Raina_Ocean_Sanjay

Extracting tables from PDFs is tricky, and the approach depends on the structure of the table.
To extract tables, we have to use string manipulation (regex and other techniques).

Read the text of the PDF file using the Read PDF Text activity (set the PreserveFormatting option to True so the table layout is kept).

The output of the Read PDF Text activity gives a string variable.

Now use string manipulation to split the string into an array of strings, one element per row, and then split each row again to get the cell values.
Please go through the attached xaml file. It should give you some direction, and you can build your solution on it, customizing it to your requirements.

REGEX_ExtractTableFromPDF.xaml (7.9 KB)
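
For anyone reading along, the row/cell splitting described above can be sketched in a few lines of Python. This is only an illustration of the logic, not UiPath code and not the contents of the attached xaml. The function name `parse_table`, the `min_columns` threshold, and the sample text are all made up for the example, and it assumes that columns in the formatted text are separated by at least two spaces.

```python
def parse_table(text, min_columns=2):
    """Split formatted PDF text into rows, then each row into cells.

    Assumption: cells are separated by two or more spaces, which is
    typically what a layout-preserving text extraction produces.
    """
    rows = []
    for line in text.splitlines():
        # Splitting on a double space leaves empty or padded fragments
        # for longer runs of spaces, so strip each piece and drop blanks.
        cells = [c.strip() for c in line.split("  ") if c.strip()]
        # Lines with too few cells are treated as non-table text.
        if len(cells) >= min_columns:
            rows.append(cells)
    return rows


# Made-up sample text, not taken from the attached invoice.
sample = (
    "Invoice No: 130792\n"
    "Item        Qty    Price\n"
    "Widget      2      10.00\n"
    "Gadget      1      25.50\n"
)
rows = parse_table(sample)
for row in rows:
    print(row)
```

In a workflow, the same idea would probably map to a Split on the line break, a For Each over the lines, and a second Split inside the loop.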

Hi @Raina_Ocean_Sanjay ,

As an alternative, we can use Regex/String Manipulation along with the Generate Data Table activity to get the desired result for the PDF sample that you have provided.

However, this might not work for the other PDFs that you have, as we do not know whether they follow the same pattern and format.

We could also pass the output of the Read PDF Text activity (PreserveFormatting = True) directly to the Generate Data Table activity, but we would then have to remove the irrelevant details from the resulting datatable afterwards.
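
As a rough illustration of that second approach (turn all of the text into rows first, then discard what is not part of the table), here is a small Python sketch. It is not UiPath code and not the attached workflow. The helper names `build_rows` and `keep_table_rows`, the header values, and the sample text are hypothetical, and it assumes the table starts at a known header row and ends at the first row with a different number of columns.

```python
def build_rows(text):
    """Turn every line into a list of cells (split on two or more spaces)."""
    return [
        [c.strip() for c in line.split("  ") if c.strip()]
        for line in text.splitlines()
    ]


def keep_table_rows(raw_rows, header):
    """Keep the header row and the rows after it that match its width.

    Raises ValueError if the header row is not present.
    """
    start = raw_rows.index(header)
    table = []
    for row in raw_rows[start:]:
        # The first row with a different column count ends the table.
        if len(row) != len(header):
            break
        table.append(row)
    return table


# Made-up sample text, not taken from the attached invoices.
sample = (
    "ACME Corp\n"
    "Customer  Jane Doe\n"
    "Item  Qty  Price\n"
    "Widget  2  10.00\n"
    "Gadget  1  25.50\n"
    "Thank you for your business\n"
)
table = keep_table_rows(build_rows(sample), ["Item", "Qty", "Price"])
print(table)
```

Because it filters on the header row and column count, this only holds up if every invoice uses the same layout.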

Check the workflow sample below:
Extract_Table_PDF.zip (12.6 KB)

Let us know if it does not work for your other sample data, and share similar data or describe the differences so that we can help you further.

Sample Invoice 1.pdf (31.1 KB)
Sample Invoice 2.pdf (22.2 KB)
Sample Invoice 3.pdf (31.2 KB)
I want to extract the table from these invoices. Is there any method other than regex? I am not supposed to use regex in my process.

Is there any method other than regex? And yes, all of my invoices are in the same format.
These are the invoices for your reference:
Sample Invoice 1.pdf (31.1 KB)
Sample Invoice 3.pdf (31.2 KB)
Sample Invoice 2.pdf (22.2 KB)