How to become an expert in PDF automation?

Hi connections.

I’m at a stage in my RPA work where I need to automate PDF processing. I have to deal with different types of PDFs (regular PDFs and scanned PDFs) and a variety of invoices. My question is how to handle all these types of PDFs and extract the data. I have tried Document Understanding, AI Fabric, Google Cloud Vision OCR, and reading the PDF text followed by string manipulation. These methods have not worked effectively for me and are not dynamic. In the YouTube videos I have watched, the PDFs are easy to process, but in reality it doesn’t work that way.
Can anyone suggest how to master PDF and OCR automation, so that I can extract data from any type of PDF?

  1. Get hands-on practice with building regexes and string manipulation
  2. Practice extracting data from many different PDF templates
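
To make the first point concrete, here is a minimal sketch in Python of regex-based field extraction, assuming the PDF text has already been read (or OCR'd) into a plain string. The sample text, the field labels, and the names `SAMPLE_TEXT`, `FIELD_PATTERNS`, and `extract_fields` are all made up for illustration; real invoices will need their own patterns, and the same regexes can be reused in whatever RPA tool you work with.

```python
import re

# Hypothetical invoice text, as it might look after a PDF read or OCR step.
SAMPLE_TEXT = """
ACME Supplies Ltd.
Invoice No: INV-2023-0042
Invoice Date: 14/03/2023
Total Amount: 1,234.50 USD
"""

# One pattern per field. The labels ("Invoice No", "Date", "Total") are
# assumptions about the template; each new template may need other labels.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-/]*)",
        re.IGNORECASE,
    ),
    "invoice_date": re.compile(
        r"(?:Invoice\s*)?Date\s*[:\-]?\s*(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})",
        re.IGNORECASE,
    ),
    "total": re.compile(
        r"Total(?:\s*Amount)?\s*[:\-]?\s*([\d.,]*\d)",
        re.IGNORECASE,
    ),
}


def extract_fields(text):
    """Return a dict of field name -> first captured value, or None if absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    print(extract_fields(SAMPLE_TEXT))
```

Returning `None` for a missing field, instead of raising an error, makes it easy to spot which templates your patterns do not cover yet, which ties in with the second point about practicing on many templates.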

Here is a video showing multiple string manipulations on PDF files

