How to extract PDF data to CSV or Excel? Please help

I want to extract data from a PDF to Excel or CSV.
I tried splitting each line of text into an array and putting it in a DataTable, but it fails because some columns contain empty values.

@Kunal_Raghav, hi and welcome to the community.

Can you provide a sample PDF document that I can try out?

report.pdf (8.5 KB)
Please find the attached sample report

Hi @Kunal_Raghav

There are a few ways to do this: one is to use the data scraping capability, and another is to use OCR to extract the content.

I generally feel more comfortable tackling more complex problems in Python, so if I can't get something right in UiPath, I usually try it in Python and call it from UiPath using the Python activities. We recently tried this for another use case, and it works well there too.
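For the empty-column problem specifically, here is a minimal sketch of what the Python route could look like. It assumes the text has already been extracted from the PDF with its column alignment intact, and that you know the character offset where each column starts. The function names, the offsets, and the sample rows are made up for illustration and are not taken from report.pdf. Slicing by position instead of splitting on spaces keeps an empty cell as an empty string, so the columns stay aligned.

```python
import csv
import io


def split_fixed_width(line, boundaries):
    """Cut one line into cells at fixed character offsets, keeping empty cells."""
    starts = list(boundaries)
    ends = starts[1:] + [None]  # the last column runs to the end of the line
    return [line[start:end].strip() for start, end in zip(starts, ends)]


def lines_to_csv(lines, boundaries):
    """Convert aligned text lines to CSV text, skipping blank lines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for line in lines:
        if line.strip():
            writer.writerow(split_fixed_width(line, boundaries))
    return buffer.getvalue()


# Hypothetical sample: three columns, each 10 characters wide.
# The second data row has an empty middle column.
sample_lines = [
    f"{'Name':<10}{'Dept':<10}{'Amount'}",
    f"{'Asha':<10}{'Sales':<10}{'1200'}",
    f"{'Ravi':<10}{'':<10}{'900'}",
]

# Column start offsets are an assumption; measure them on your own text.
print(lines_to_csv(sample_lines, [0, 10, 20]))
```

This only helps when the extracted text really is aligned in fixed-width columns. If the column positions shift from page to page, the offsets would need to be detected per page rather than hard-coded.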

Hi Jacqui,
I have tried both data scraping and a few OCR engines to extract the data, but I could not achieve the desired result, because the extraction does not read the empty space and jumps straight to the next text field.
It would be very helpful if you could share a Python script so I can try the same, as I am less familiar with Python.

Hello,

I think the way to do it is to scrape the PDF file and select OCR, but that will not be sufficient.

@Kunal_Raghav

Hey, this is really a challenge. I was trying to convert the PDF file into HTML first so that I could then get the table from it. I tried to follow the instructions below, but it is not working yet. Have a look as well and try other things until I can get this working.

Hey everyone,

This is what I use to read a table from a PDF into a DataFrame (or DataTable) in Python: https://chezo.uno/blog/2017-01-09_tabula-py--extract-table-from-pdf-into-python-dataframe-6c7acfa5f302/
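A rough sketch of how that could look with tabula-py, assuming the package is installed (`pip install tabula-py`) and Java is available, since tabula-py wraps the Java tool Tabula. The function name `pdf_tables_to_csv`, the output file naming, and the optional `read_pdf` argument are my own choices for illustration; the argument is only there so the function can be tried without tabula-py installed. Whether `stream=True` suits the sample report is a guess: it makes tabula infer columns from whitespace, while `lattice=True` is meant for tables drawn with ruling lines.

```python
from pathlib import Path


def pdf_tables_to_csv(pdf_path, out_dir, read_pdf=None):
    """Write each table found in pdf_path to its own CSV file in out_dir."""
    if read_pdf is None:
        # Imported lazily so this module loads even where tabula-py is missing.
        import tabula  # third-party: pip install tabula-py (requires Java)
        read_pdf = tabula.read_pdf

    # With multiple_tables=True, read_pdf returns a list of pandas DataFrames.
    # stream=True is an assumption about the layout; try lattice=True instead
    # if the table has visible cell borders.
    tables = read_pdf(pdf_path, pages="all", multiple_tables=True, stream=True)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for number, table in enumerate(tables, start=1):
        target = out_dir / f"table_{number}.csv"  # hypothetical naming scheme
        table.to_csv(target, index=False)  # cells with no value become empty fields
        written.append(target)
    return written
```

Calling `pdf_tables_to_csv("report.pdf", "out")` would then write one CSV per detected table into the `out` folder. tabula-py also offers `tabula.convert_into(...)` for writing a CSV in a single call, which may be enough if you do not need the DataFrames in between.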
