How to convert from PDF to XLS?

Hello RPA developers,
I am trying to read a bank statement “.PDF” file and convert it into an Excel spreadsheet (“.xlsx”). The requirement is to go through each row in the PDF, extract the values, and use them for validation.

I would appreciate your ideas and input on how to extract, read, and convert a PDF to xlsx/xls/csv as a properly structured DataTable.

E.g., any bank statement with a list of GL transactions.

Hi @vijaygrpa,
Use the Read PDF With OCR activity; it extracts all of the data into a single String variable.
Then use the Write Range or Write Cell activity to write the content to the .xlsx file.
If you need a reference, check this link: Learn Robotic Process Automation with RPA Tutorials for Beginners

Ashwin S
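
Outside UiPath, the same idea (take the single extracted string, split it into rows, and write them out in a spreadsheet-readable form) can be sketched in Python. This is only an illustration under assumptions: the sample text and the `text_to_rows` and `rows_to_csv` helpers are made up, columns are assumed to be separated by two or more spaces or a tab, and CSV is produced instead of .xlsx so that only the standard library is needed.

```python
import csv
import io
import re


def text_to_rows(text):
    """Split extracted text into rows of cells (hypothetical helper).

    Assumes one record per line and columns separated by 2+ spaces or a tab.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that Excel can open."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up sample standing in for the string returned by the PDF read step.
sample = """Date        Description       Amount
01/02/2019  Office supplies   120.50

02/02/2019  Bank fee          15.00
"""

rows = text_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Splitting on runs of whitespace keeps descriptions such as "Office supplies" in one cell, but it only works if the extracted text preserves column spacing, which OCR output may not.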

Hi Ashwin, thanks for the reply. I did try “Read PDF With OCR”, and it works well when extracting a single page from the PDF. In my scenario, however, I need the transactions extracted from multiple pages (say, pages 4-10). The format is not the same; it differs from page to page. So this method of extraction has not helped me.

I appreciate your suggestion, and I welcome more suggestions.
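
When the layout changes from page to page, one option is to extract the text per page and try several line patterns, normalising every match to the same columns. Below is a rough Python sketch under assumptions: `pages` is a hypothetical list of already-extracted page texts, and both regular expressions describe invented layouts, not any real bank's statement.

```python
import re

# Two invented line layouts; a real statement needs its own patterns.
PATTERNS = [
    # e.g. "01/02/2019 Office supplies 120.50"
    re.compile(
        r"^(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<desc>.+?)\s+(?P<amount>-?[\d,]+\.\d{2})$"
    ),
    # e.g. "120.50 | 2019-02-01 | Office supplies"
    re.compile(
        r"^(?P<amount>-?[\d,]+\.\d{2})\s*\|\s*(?P<date>\d{4}-\d{2}-\d{2})\s*\|\s*(?P<desc>.+)$"
    ),
]


def parse_line(line):
    """Return a dict with date, desc, amount, or None if no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.match(line.strip())
        if match:
            return match.groupdict()
    return None


def extract_transactions(pages, first_page, last_page):
    """Collect transactions from a 1-based, inclusive page range."""
    rows = []
    for number in range(first_page, last_page + 1):
        for line in pages[number - 1].splitlines():
            row = parse_line(line)
            if row is not None:
                row["page"] = number  # keep the source page for validation
                rows.append(row)
    return rows


# Made-up page texts; page 1 is outside the requested range.
pages = [
    "Account summary\nOpening balance 1,000.00",
    "01/02/2019 Office supplies 120.50\nPage 2 of 3",
    "15.00 | 2019-02-02 | Bank fee",
]
transactions = extract_transactions(pages, 2, 3)
for t in transactions:
    print(t)
```

Lines that match no pattern (headers, footers, page counters) are simply skipped, so it is worth logging them during development to spot layouts that still need a pattern.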

Use the Start Process activity to open the PDF file, then use Send Hotkey to copy the content from the PDF.
Inside an Excel Application Scope, use Send Hotkey to paste it into the Excel file.

For those with licenses, Acrobat has exceptionally good XLSX conversion built in.

EDIT: Unfortunately, Adobe expressly forbids using Acrobat for RPA: Adobe Acrobat automation and document workflows

Original Post

  1. Open Acrobat
  2. Click File
  3. Click Export To
  4. Click Spreadsheet
  5. Click Microsoft Excel Workbook

The translation of form layouts to Excel cells is particularly good. Say you have 3 fields laid out like this:

+---------------------------+
| Name: John Doe            |
+------------+ +------------+
| Age: 112   | |  Happy: Y  |
+------------+ +------------+

Acrobat will correctly put Name on row 1 of the spreadsheet and make it span multiple columns. Age and Happy will be on row 2 and each span 1 column.

Try PicsArt, it’s pretty easy to use and free.

How do I use PicsArt?