Convert PDF to Excel

jingwang0222 · November 21, 2018, 12:10am

Hi all

I am working with a secured PDF that consists entirely of diagrams, and I need to convert it to Excel.
Because it is secured, the only thing I can do is convert the PDF to text and then capture the content I need from the text into an Excel sheet.

There is one diagram I cannot convert: test.pdf (27.4 KB)

For security reasons, this PDF is only an example of the diagram. I wrote it in Word and generated the PDF from that.

I want it to appear in its normal format in Excel.
However, it looks like this in Notepad.

Does anyone have any idea how to do this?

Thanks so much!

Jumbo · November 21, 2018, 12:25am

@jingwang0222

Open the PDF with Microsoft Word first, then copy the content into Excel. It might work.

J,

irahmat · November 21, 2018, 4:34am

Hi,
You have the same problem I had. I used Tabula to solve it. Try it and let me know.

jingwang0222 · November 21, 2018, 5:11am

Hi Jumbo

I converted it to Word and tried to copy and paste the Word content into Excel, but it didn't work well.

Here are the Word file and the XAML. Can you have a look? Thanks!

test doc.zip (28.1 KB)

530.xaml (30.2 KB)

jingwang0222 · November 21, 2018, 5:17am

Hi
Thanks for your advice, but my file is secured and cannot be uploaded.

cheers

Jumbo · November 21, 2018, 6:08am

@jingwang0222

I checked both and I see what you mean: there are many useless blanks and line breaks in the Word file.
However, I suppose this comes from the PDF format, and the data itself is the same as in the PDF, right?

As far as I know, this is the only way to read a PDF file while keeping its table format if you cannot use data scraping on the PDF (other than OCR).
So if the format is stable, I recommend extracting the PDF data this way and deleting the spaces you don't need…

Rgds,
J,
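
For anyone following along, the cleanup suggested above (dropping the blank lines and extra spaces left over from the PDF layout) could look roughly like the sketch below. It is written in Python rather than as a UiPath workflow, and it assumes that columns in the extracted text are separated by a tab or by two or more spaces; the function name and the sample text are made up for illustration.

```python
import re


def clean_extracted_text(text):
    """Split extracted text into rows of cells, dropping blank lines and padding."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip empty lines left over from the PDF layout
        # Assumption: a tab or a run of two or more spaces marks a column gap,
        # while a single space is part of the cell text.
        cells = [cell.strip() for cell in re.split(r"\s{2,}|\t", line)]
        rows.append(cells)
    return rows


if __name__ == "__main__":
    # Hypothetical sample, not taken from the attached files.
    sample = "Name      Qty\n\n\nWidget A     3\n   Bolt\t12  \n"
    for row in clean_extracted_text(sample):
        print(row)
```

Each row is a list of cell strings, so it can be pasted or written into a sheet one cell per column. If the real file separates columns with single spaces, the split rule would need to change.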

jingwang0222 · November 26, 2018, 10:28pm

Thanks for your advice. I also tried that method, but it was too slow and didn't work as well as I had hoped. So I went back to capturing the data from the txt file, and I want to use regular expressions to do that, which seems achievable for me. If you are still interested, I have asked a new question.

Thanks
Jing
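
As a rough illustration of the regular-expression approach mentioned above, the sketch below captures fields from each line of the extracted text and writes them out as CSV, which Excel can open. It is in Python, not UiPath, and the line layout (an item code, a description, an amount) is purely hypothetical; the pattern would have to be adapted to the actual text.

```python
import csv
import io
import re

# Hypothetical layout: two capital letters plus three digits, a description,
# then a numeric amount at the end of the line.
LINE_PATTERN = re.compile(
    r"^(?P<code>[A-Z]{2}\d{3})\s+(?P<description>.+?)\s+(?P<amount>\d+(?:\.\d+)?)$"
)


def extract_rows(text):
    """Return [code, description, amount] for every line that matches the pattern."""
    rows = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:  # lines such as page headers simply do not match and are skipped
            rows.append([match["code"], match["description"], match["amount"]])
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "description", "amount"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample text standing in for the converted PDF.
    text = "Page 1 of 2\nAB123   Steel bracket    14.50\n\nCD456 Hex bolt M8 200\n"
    print(rows_to_csv(extract_rows(text)))
```

Lines that do not fit the pattern are ignored without any warning, so it is worth comparing the number of captured rows against the source before trusting the output.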

irahmat · November 27, 2018, 2:53am

Hi,

What do you mean by upload? I use tabula-java.

Dave_Chandra_US_Tax · August 3, 2019, 7:37pm

Hi Jumbo,

For a project that I have just been assigned, I have a similar issue ==> capture text from images of checks (which will be in “.pdf” format) and write an Excel table via UiPath.

Do you happen to know of a way that I can do the above?

Also, based on your suggestion here, how does UiPath come into play? I am not seeing that. (I’m curious, that’s all.)

Any tips you can provide would be highly appreciated!

Thank you very much!

Very best,

@Dave_Chandra_US_Tax

Serena · August 4, 2019, 6:51am

Hi,

If you are using Adobe Acrobat DC, this custom package can help you: https://connect.uipath.com/community/project/pua-virtual-acrobat-dc-pdf-activities

You can first convert the scanned PDF into an editable PDF (activity “Correct rotation& convert scanned PDF to editable text & images”). Then export the editable PDF to Excel (activity “Export PDF files to other format”).

Jumbo · August 5, 2019, 1:37am

@Dave_Chandra_US_Tax

There are many extra packages available in the package manager, and this is one way to do it.
