Extract from PDF to Excel specifically

Hey everyone, I have a specific problem: I have a PDF file in which all the data is in one table.

I am trying to extract the data as rows and columns and save it in Excel, but I don't know how to do it. Your help will be highly appreciated.

I am attaching the screenshot of my problem for more clarity.

The screenshot below shows how I want to save it in Excel.

Is the document digital, i.e. can you use Ctrl+A, Ctrl+C to copy the text?
If so, you can use UiPath.PDF.Activities,
specifically the Read PDF Text activity.

The problem will be formatting the text into a clean DataTable to write to an Excel document.
You could try regex and pattern matching.
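
For illustration, here is a minimal Python sketch of the regex idea, outside UiPath. The row layout (date, description, amount), the pattern, and the sample text are all assumptions, since the real table is only visible in the screenshot; the same pattern could be adapted to a UiPath regex activity.

```python
import re

# Hypothetical row layout: date, free-text description, amount.
# The real PDF's columns are not shown in the thread, so adjust the pattern.
ROW_PATTERN = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{4})\s+"   # e.g. 01/02/2023
    r"(?P<description>.+?)\s+"           # shortest text before the amount
    r"(?P<amount>-?[\d,]+\.\d{2})$"      # e.g. 1,250.00
)


def parse_rows(pdf_text):
    """Return one dict per line that matches ROW_PATTERN; skip other lines."""
    rows = []
    for line in pdf_text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows


# Made-up sample standing in for the text returned by Read PDF Text.
sample_text = """Your Transaction Details
01/02/2023 Card payment Grocery Store 45.10
02/02/2023 Salary 1,250.00
Page 1 of 1"""

rows = parse_rows(sample_text)
```

Lines that do not look like a table row, such as the heading or a page footer, are simply skipped, which is usually the behaviour you want for text copied out of a PDF.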

Alternatively, you could use some sort of document understanding tool, but that might be overkill in this specific case.

First, read the PDF file with the Read PDF Text activity.
Remove the "Your Transaction Details" heading with pdftext.replace("Your Transaction Details", "").
Then use the Generate Data Table From Text activity, with the column headers set in its properties.
Then use the Write Range activity to write the resulting DataTable from Generate Data Table to Excel.
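
The same steps can be sketched in plain Python to show the logic. This is only an assumption-laden illustration: the sample text, the column names, and the rule that columns are separated by two or more spaces are all made up, and it writes CSV text (which Excel opens) rather than a real workbook.

```python
import csv
import io
import re

HEADING = "Your Transaction Details"


def text_to_table(pdf_text, headers):
    """Strip the heading, then split each remaining line into columns."""
    cleaned = pdf_text.replace(HEADING, "")
    table = [headers]
    for line in cleaned.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: columns are separated by two or more spaces.
        table.append(re.split(r"\s{2,}", line))
    return table


def table_to_csv(table):
    """Serialize the table as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


# Made-up sample standing in for the output of the read step.
sample_text = """Your Transaction Details
01/02/2023  Grocery Store  45.10
02/02/2023  Salary  1,250.00"""

table = text_to_table(sample_text, ["Date", "Description", "Amount"])
csv_text = table_to_csv(table)
```

Values that contain a comma, such as 1,250.00, are quoted by the csv module, so they stay in one cell when the file is opened in Excel.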

Hope this helps.
If you face any issue, please attach a sample file so we can suggest a better approach.

Hi @Muhammad_Anas_Baloch ,

If the document is digital, then a combination of the PDF activities and regex should be able to achieve the required output. You can check the post below, which should be similar to your case:

You can extract the table from the PDF to CSV using the Python tabula-py library.
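
A rough sketch of what that could look like, assuming tabula-py (which needs Java), pandas, and openpyxl are installed. The function name and the file names in the usage comment are placeholders, and how well the table is detected depends on the PDF.

```python
def pdf_table_to_excel(pdf_path, excel_path):
    """Read every table tabula finds in the PDF and save them to one sheet."""
    # Imported inside the function so this file still loads on machines
    # where the optional dependencies are not installed.
    import pandas as pd
    import tabula

    # read_pdf returns a list of DataFrames, one per detected table.
    tables = tabula.read_pdf(pdf_path, pages="all", multiple_tables=True)
    if not tables:
        raise ValueError(f"No tables detected in {pdf_path}")

    # Stack the per-page tables; this assumes they share the same columns.
    combined = pd.concat(tables, ignore_index=True)
    combined.to_excel(excel_path, index=False)  # needs openpyxl for .xlsx
    return len(combined)


# Example call with placeholder file names:
# pdf_table_to_excel("statement.pdf", "statement.xlsx")
```

If you only need a CSV, tabula.convert_into(pdf_path, csv_path, output_format="csv", pages="all") writes it directly without going through pandas.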


Do you think it will work in my case?

Yes, I can use Ctrl+A and Ctrl+C in the PDF.

Yes, you just have to cleanse the CSV a bit.
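
What "cleansing" means depends on the file, but a typical pass might look like the sketch below. It assumes the usual problems are stray whitespace, empty rows, and the header row repeating on each page; the sample data is made up.

```python
import csv
import io


def cleanse_csv(raw_csv):
    """Trim cells, drop empty rows, and drop repeats of the header row."""
    reader = csv.reader(io.StringIO(raw_csv))
    cleaned = []
    header = None
    for row in reader:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue            # blank line or row of empty cells
        if header is None:
            header = cells      # first non-empty row is taken as the header
        elif cells == header:
            continue            # header repeated on a later page
        cleaned.append(cells)
    return cleaned


# Made-up sample of the kind of CSV an extraction tool might produce.
raw = """Date, Description ,Amount
01/02/2023,Grocery Store,45.10
,,
Date,Description,Amount
02/02/2023,Salary,"1,250.00"
"""

rows = cleanse_csv(raw)
```

The cleaned rows can then be written back with csv.writer or loaded into Excel as they are.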