How to scrape data from a non-scrapable PDF and then copy it to Google Spreadsheet?


#1

I have two non-scrapable PDFs. They are similar to the PDF below.
TLISC-sample-Invoice.pdf (430.5 KB)

How can I scrape the data from the PDF and then paste it into the corresponding columns in Google Spreadsheet?

For example, I could scrape:

  • the Project ID, and paste it in the Project ID column in the spreadsheet
  • the Description, Qty, Rate, GST and Amount of the three items, and paste them in the Description, Quantity, Rate, GST and Amount columns in Google Spreadsheet.

I tried using Read PDF with OCR (Google OCR) and saving the result in a DataTable, but it is hard to determine which information should go into which column.

I also tried Data Scraping, but the PDF cannot be scraped. I get the ‘This control does not support data extraction’ error.

Also, does UiPath have a Google Spreadsheet activity? If not, how can I connect to Google Spreadsheet?

Thanks
Bonnie


#2

Hi,

  • The only way to get text from a non-scrapable PDF is to use Read PDF with OCR, but with the Google OCR engine the scraped text will contain some errors.

  • So, in Read PDF with OCR, change the OCR engine to Microsoft OCR and you should get the text without errors.

  • Then you have to manipulate the string to extract the text you want and populate the spreadsheet as required.
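
To illustrate the string manipulation step, here is a rough sketch in Python (not a UiPath workflow) of how the OCR text could be split into columns with regular expressions. The sample text, the function name parse_invoice_text and both patterns are assumptions made for illustration; the real invoice layout and OCR output will differ, so the patterns would need adjusting.

```python
import re

# Hypothetical OCR output; the real invoice text and layout may differ.
SAMPLE_TEXT = """\
Project ID: P-1001
Description Qty Rate GST Amount
Site inspection 2 150.00 30.00 330.00
Report writing 1 200.00 20.00 220.00
Travel 3 50.00 15.00 165.00
Total 715.00
"""

# Assumed line-item shape: free-text description, an integer quantity,
# then three amounts with two decimal places (rate, GST, amount).
ITEM_RE = re.compile(
    r"^(?P<description>.+?)\s+(?P<qty>\d+)\s+(?P<rate>[\d,]+\.\d{2})\s+"
    r"(?P<gst>[\d,]+\.\d{2})\s+(?P<amount>[\d,]+\.\d{2})$"
)
# Assumed label format for the project identifier.
PROJECT_RE = re.compile(r"Project\s*ID\s*[:\-]?\s*(\S+)", re.IGNORECASE)


def parse_invoice_text(text):
    """Return (project_id, rows); rows is a list of dicts, one per line item."""
    match = PROJECT_RE.search(text)
    project_id = match.group(1) if match else None

    rows = []
    for line in text.splitlines():
        item = ITEM_RE.match(line.strip())
        if item:  # header and total lines do not match and are skipped
            rows.append(item.groupdict())
    return project_id, rows


if __name__ == "__main__":
    project_id, rows = parse_invoice_text(SAMPLE_TEXT)
    print(project_id)
    for row in rows:
        print(row)
```

Each dict corresponds to one spreadsheet row, with keys matching the Description, Quantity, Rate, GST and Amount columns. The same idea should carry over to UiPath by applying equivalent regex patterns to the OCR output string and adding the captured groups to a DataTable.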

Thanks and Regards


#3

I can’t use Microsoft OCR; it throws an error.

I am using UiPath version 2018.1.1.