Extract table data from PDF to CSV


#1

Hi all, I am new to UiPath. I want to extract table data from a PDF into a CSV file.

I have attached the PDF file for your reference. Could someone please help me out with this?

I appreciate your kind help and support.

TURBO COOLING SYSTEM WE MEAN COOLING_1.pdf (132.0 KB)


#2

Hi @Sailaja_Chikkam,

Use screen scraping to get the data.

Then build a DataTable from it and use the Write CSV activity to write the data to a CSV file (Write Range targets Excel workbooks).

Refer to this link


#3

With screen scraping, I am not getting the required output. All the column values are extracted into a single column.


#4

Yes @Sailaja_Chikkam, based on your requirement you need to split the data and build the table yourself.

Split the string on the tab character.
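To illustrate the idea outside of UiPath, here is a minimal Python sketch of splitting scraped text on tabs and turning it into CSV. It assumes the scraped text really does contain tab characters between columns, which may not hold for every PDF. The function name `tab_text_to_csv` and the sample rows are made up for illustration; only the "S.N" header comes from this thread.

```python
import csv
import io


def tab_text_to_csv(scraped: str) -> str:
    """Convert tab-separated scraped text into CSV text (hypothetical helper)."""
    rows = []
    for line in scraped.splitlines():
        if not line.strip():
            continue  # skip blank lines between table rows
        # One cell per tab-separated chunk, with stray whitespace trimmed
        rows.append([cell.strip() for cell in line.split("\t")])

    buffer = io.StringIO()
    # csv.writer quotes any cell that itself contains a comma
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Made-up sample standing in for the scraped table text
sample = "S.N\tItem\tQty\n1\tFan, 12 inch\t4\n"
csv_text = tab_text_to_csv(sample)
print(csv_text)
```

If the columns end up in one cell anyway, the scraped text probably uses spaces rather than tabs as separators, and a different split rule would be needed.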


#5

I did, but it is not splitting properly.


#6

I just tried screen scraping the table inside the PDF file you provided, and I see what you mean.

First of all, it is not clear whether the table content is going to change.
By that I mean: are you going to screen scrape multiple PDF files with different tables?
(If you are, you will need to make my solution more dynamic; this is only to show what is possible in your specific case.)

For this specific case I would do something like this (it could be a little tricky on the first try):

  1. Open the PDF file in Adobe Reader

  2. Use UiPath Explorer to target the table elements one by one

    // This targets the Adobe Reader window whatever its title is, because the “*” wildcard is set in the title attribute.
    <wnd app='acrord32.exe' cls='AcrobatSDIWindow' title='* - Adobe Reader' />

    // Targeting the row we want to get data from (in this case row 1)
    <ctrl idx='1' role='row' />
    // Then the column header 1
    <ctrl role='column header' idx='1' />

    All of this information comes from UiPath Explorer and can be reused for every “row - column” pair you want to target. The results can then be converted to a CSV-formatted string.
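As a rough illustration of that last conversion step, here is a small Python sketch. It assumes the text read from each cell has already been collected into a dictionary keyed by (row, column); the function name `cells_to_csv` and the sample values are hypothetical, and in UiPath itself the same thing would be done with a DataTable.

```python
import csv
import io
from typing import Dict, Tuple


def cells_to_csv(cells: Dict[Tuple[int, int], str]) -> str:
    """Arrange per-cell reads into rows and return them as CSV text."""
    row_ids = sorted({row for row, _ in cells})
    col_ids = sorted({col for _, col in cells})

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in row_ids:
        # A cell that was never read becomes an empty field
        writer.writerow([cells.get((row, col), "") for col in col_ids])
    return buffer.getvalue()


# Made-up reads: row 1 holds the headers, row 2 the first data row
reads = {(1, 1): "S.N", (1, 2): "Item", (2, 1): "1", (2, 2): "Fan"}
csv_text = cells_to_csv(reads)
print(csv_text)
```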

  3. Use the OCR activity “Get OCR Text” and copy the selected item into the Selector property of the “Get OCR Text” activity.

  4. For every “row - column” pair you want to read, change the idx values in the selector:

    // Targeting the row we want to get data from (in this case row 1)
    <ctrl idx='1' role='row' />
    // Then the column header 2
    <ctrl role='column header' idx='2' />

    When you are no longer reading the header, change the role from column header to cell:

    <ctrl role='column header' idx='2' />

    to

    <ctrl role='cell' idx='5' />
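Since only the role and idx values change from one read to the next, the selector strings can be generated instead of typed by hand. The Python sketch below only shows the string building. The helper names `build_selector` and `header_selectors` are made up, and the idx values for cells are an assumption that has to be checked in UiPath Explorer, since the numbering of cells depends on the document.

```python
# Window part of the selector, copied from the example above
WINDOW = "<wnd app='acrord32.exe' cls='AcrobatSDIWindow' title='* - Adobe Reader' />"


def build_selector(row_idx: int, role: str, idx: int) -> str:
    """Build a full selector for one element inside one table row."""
    return (
        f"{WINDOW}"
        f"<ctrl idx='{row_idx}' role='row' />"
        f"<ctrl role='{role}' idx='{idx}' />"
    )


def header_selectors(column_count: int) -> list:
    """Selectors for the column headers 1..column_count in row 1."""
    return [
        build_selector(1, "column header", col)
        for col in range(1, column_count + 1)
    ]


for selector in header_selectors(3):
    print(selector)

# A data cell uses role 'cell'; the idx 5 here is taken from the example above
print(build_selector(1, "cell", 5))
```

In a workflow, the same idea would be a loop that updates the selector string before each “Get OCR Text” call.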

I know this solution takes a bit of work, but you will have full control over the output you get from the table.
And I think it teaches a lot if you have just started your UiPath journey.

The above will result in:
S.N

Let me know if anything is unclear.


#7

Hi @Dev,

I am able to retrieve single values with the selector, but I want to extract an entire column's data as a string. With this approach I am only able to extract a specified cell value.

Thanks for your support!