Extract PDF table data into Excel

Hi All,

I am new to UiPath and working on a project where I have to extract table data from a PDF into Excel. I have tried the Data Scraping and Screen Scraping options as well, but I could not get the output.

I have gone through similar articles in the forum, but I could not understand the logic.

Can someone help me implement the logic in a simpler manner?

I am attaching a sample PDF for your reference.

sampledata.pdf (110.8 KB)

Your kind help and support are much appreciated.

Thanks again.

What do you want to extract from this PDF?

I would like to extract the table data from the PDF.

I have attached a screenshot for your reference.

Main.xaml (11.4 KB)
Hi, please find the solution here. We have 5 arrays (speed, driver, car, engine and date) and there is a Build Data Table activity. All you need to do is loop through each array and add the data to the data table.
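
For readers who cannot open the .xaml, here is a minimal sketch of the same idea in Python rather than as UiPath activities. It assumes the five arrays are parallel (item i of each array belongs to row i). The helper name `build_rows` and the sample values are hypothetical and are not taken from Main.xaml or sampledata.pdf.

```python
def build_rows(speed, driver, car, engine, date):
    """Combine five parallel column arrays into a list of row tuples."""
    lengths = {len(speed), len(driver), len(car), len(engine), len(date)}
    if len(lengths) != 1:
        raise ValueError("all column arrays must have the same length")
    # zip walks the five arrays in step, one row per index
    return list(zip(speed, driver, car, engine, date))


# Placeholder values for illustration only
speed = ["100.0", "200.0"]
driver = ["Driver A", "Driver B"]
car = ["Car A", "Car B"]
engine = ["Engine A", "Engine B"]
date = ["2001-01-01", "2002-02-02"]

rows = build_rows(speed, driver, car, engine, date)
for row in rows:
    print(row)
```

Each tuple corresponds to one row that would be added to the data table inside the loop.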

Hi Divya,
I want to get the data into a CSV in tabular format.

I tried with Excel, but I got all the data in a single cell. I have attached the sheet for your reference.

sample.xls (8.8 KB)

If you create a proper data table, you can write it to either Excel or CSV; it doesn't matter which.
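
To illustrate what "proper" means here, the following is a rough Python sketch (not UiPath code): once every row is a list of separate fields instead of one long string, the writer puts each field in its own cell. The function name `table_to_csv` and the sample data are assumptions made for the example.

```python
import csv
import io


def table_to_csv(header, rows):
    """Serialize a header and a list of rows (each a list of fields) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)  # one CSV line per row, one cell per field
    return buffer.getvalue()


# Placeholder table for illustration only
header = ["Speed", "Driver"]
rows = [["100", "A"], ["200", "B"]]

csv_text = table_to_csv(header, rows)
print(csv_text)
```

If a row were passed as the single string "100 A" wrapped in a one-item list, it would land in one cell, which matches the single-cell symptom described above.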

Hi Sailaja,
Main (1).xaml (17.6 KB)
As per your requirement, this stores the data in both Excel and CSV.

Thanks,
Venkatesh

Thank you. I am able to extract data from the sample PDF attached.

But I am unable to extract data from another PDF. I have changed all the rows, columns and variables to match the other PDF. Could you please help me out with this?

TURBO COOLING SYSTEM WE MEAN COOLING_1.pdf (132.0 KB)

I am attaching the PDF.

I would like to extract this …

I have changed the substring values according to the data in the PDF. I am only able to extract the required content, but not column-wise.

Please help me out with this.

I appreciate your kind help and support.

How can we extract data from a table in a PDF if we do not know the number of rows?

Hi, I have a PDF that contains a scanned image in table format.
How can I extract the data from the PDF and store it in Excel?
I tried using OCR, but the extracted output is stored in a single cell without alignment. Can you suggest a solution, if possible?
I have attached my sample PDF file: sample…pdf (433.0 KB)

I have a similar requirement and tried your attached solution, but it gives me “Missing or invalid activity”. Please let me know what I am missing.

Try updating and installing all available packages.

Thanks, it worked after installing the packages. I need to pick data from insurance certificates that come from different insurance companies in different formats. Please advise which approach I should take.

Hello,

I have the same trouble, but I am unable to loop through the arrays to get all the information in the different rows. How should I do this? I attached a photo of the pdf as I wasn’t able to upload it.

Thanks in advance!

@Divyashreem, I am unable to extract table data from the attached PDF when I use your workflow. Please help… FFF417A5
Quick help would be appreciated…

Hi,
To be frank, I have no clue which workflow you are using for this. Can you please elaborate on the steps?

Thank you,
Divya

Hi,
The one you shared in the previous posts: Main (1) (1).xaml (14.9 KB).

We are using the attached PDF in your workflow, but it is not working: 859654-Mr-Graham-H-Smith.pdf (43.1 KB)

Thank you,
Smaily

Read PDF Text → Substring → Split String (output is an array) → Generate Data Table (input: array(0), ColumnSeparators: " ", output: dt) → Write Range
This works for writing the first row to Excel; use a For Each to write all the rows.
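
The same chain can be sketched in Python to show the logic outside UiPath, including the loop over every row, so the number of rows does not need to be known in advance. This is only an illustration under assumptions: the markers "TABLE START" and "TABLE END", the sample text and the function name `extract_table` are all made up, and real PDF text would need its own markers.

```python
def extract_table(text, start_marker, end_marker):
    """Cut the table block out of the text and split it into rows of fields."""
    # Substring step: keep only the text between the two markers
    start = text.index(start_marker) + len(start_marker)
    end = text.index(end_marker, start)
    block = text[start:end]

    rows = []
    # Split step plus For Each: one iteration per line, however many there are
    for line in block.splitlines():
        fields = line.split()  # split on whitespace, like ColumnSeparators " "
        if fields:  # skip blank lines
            rows.append(fields)
    return rows


# Placeholder text standing in for the Read PDF Text output
sample_text = (
    "Report\nTABLE START\n1 alpha 10\n2 beta 20\n3 gamma 30\nTABLE END\nfooter"
)

rows = extract_table(sample_text, "TABLE START", "TABLE END")
for row in rows:
    print(row)
```

Note that splitting on spaces breaks any cell value that itself contains a space, so multi-word columns would need a different separator or fixed positions.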

May I know which package you installed or updated? I am also facing the same problem.