How to read a PDF file into a tabular format?


#1

Hi, how can I read a PDF file into a tabular format? Is there an activity for that?


#2

Hi @kiran,

Basically, UiPath provides three ways of extracting data from a PDF:

  1. Get Text activity combined with the Anchor Base activity
  2. Read PDF Text activity
  3. Read PDF With OCR activity (suggested only when the file is not a native PDF; it is the last option I would recommend because OCR is prone to errors)

In your case, I would suggest testing all three options and checking which one best fulfills your requirements. In my case, I had a project where I had to extract specific elements from a PDF, so I used the Read PDF Text activity to extract the whole text. After that, I split the text into an array of lines and searched for the text I needed with functions like Substring, IndexOf, Split and so on.
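
To make the split-and-search idea concrete, here is a minimal sketch of the logic. UiPath expressions are written in VB.NET, so this Python version only mirrors the approach and is not what you would paste into a workflow. The function names `extract_field` and `rows_to_table`, and the sample text, are made up for illustration; the real output of Read PDF Text will depend on your document's layout.

```python
from typing import List, Optional


def extract_field(pdf_text: str, label: str) -> Optional[str]:
    """Return the text that follows `label` on the first line containing it."""
    for line in pdf_text.splitlines():      # same idea as Split on newlines
        idx = line.find(label)              # same idea as IndexOf
        if idx != -1:
            # same idea as Substring: keep everything after the label
            return line[idx + len(label):].strip()
    return None


def rows_to_table(pdf_text: str, delimiter: Optional[str] = None) -> List[List[str]]:
    """Split every non-empty line into cells (whitespace-separated by default)."""
    return [line.split(delimiter) for line in pdf_text.splitlines() if line.strip()]


# Hypothetical text, standing in for what a PDF text extraction might return.
sample_text = "Invoice No: 12345\nDate: 2016-05-01\nItem Qty Price\nWidget 2 9.99\n"

invoice_number = extract_field(sample_text, "Invoice No:")
table_rows = rows_to_table(sample_text)
```

Splitting on whitespace only works when the cell values themselves contain no spaces; for real tables you will probably need a different delimiter or fixed column positions.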

I hope it helps. :slight_smile:


#3

Hello.

Depending on the PDF structure, on 2016.2 the new Data Scraping wizard might be able to extract a table directly if the PDF is native (that is, you can select the text). So you can also test that.


#4

Thanks acaciomelo, will try this also.


#5

Thanks Nicolae, so with data scraping we can get the table. Hmm, I think this is also a good option. Thanks again :)


#6

Hi @kiran,

Could you please let me know which data type to use for each line when reading the lines in a For Each activity?

Regards,
Serran