How to become an expert in PDF automation?

Hi connections.

I’m at a stage in my RPA work where I need to automate PDF processing. I have to deal with different types of PDFs (regular PDFs and scanned PDFs) and a variety of invoices. My question is how to handle all of these types and get the data out of them. I have tried Document Understanding, AI Fabric, Google Cloud Vision OCR, and reading the PDF text followed by string manipulation. These methods have not worked effectively for me and are not dynamic. In the YouTube videos I have watched, the PDFs look easy to process, but in reality it doesn’t work that way.
Can anyone suggest how to master PDF and OCR automation, so that I can get data from any type of PDF?

  1. Get hands-on practice with building regexes and manipulating strings
  2. Practice extracting data from many different PDF templates
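
To make the first point a bit more concrete, here is a minimal sketch in Python of regex-based field extraction. It assumes the text has already been pulled out of the PDF (by a PDF reader for regular PDFs or by an OCR engine for scanned ones). The sample text, the field names, and the patterns are all made up for illustration and would need tuning for each real invoice template.

```python
import re
from decimal import Decimal

# Hypothetical text, standing in for whatever a PDF text reader
# or an OCR engine returns for one invoice.
SAMPLE_TEXT = """
ACME Corp
Invoice No: INV-2023-0042
Invoice Date: 14/03/2023
Total Amount:   1,234.50 EUR
"""

# One pattern per field. Each pattern tolerates small label variations
# (e.g. "Invoice No", "Invoice Number", "Invoice #") and extra whitespace.
# These patterns are illustrative assumptions, not a universal solution.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No|#)\.?\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "invoice_date": re.compile(
        r"(?:Invoice\s*)?Date\s*[:\-]?\s*(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})",
        re.IGNORECASE,
    ),
    "total": re.compile(
        r"Total(?:\s*Amount)?\s*[:\-]?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text):
    """Return a dict of field name -> extracted value (None if not found)."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    # Normalise the amount: drop thousands separators and parse as a decimal.
    if result["total"] is not None:
        result["total"] = Decimal(result["total"].replace(",", ""))
    return result


fields = extract_fields(SAMPLE_TEXT)
print(fields)
```

Returning None for a missing field, rather than raising an error, makes it easy to spot which templates still need their own pattern as you practice on more layouts.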

Here is a video showing several string manipulation techniques applied to PDF files



Hello Christian,

I am keen to learn this, since I have to process many different invoices and this video seems ideal for that!

Could you please tell me where I can find the PDF files you are using in this video?

Many thanks,