Hello, I hope this question is simple.
My goal is to convert PDFs to text using Google OCR. I would like the text of each page to be one long string, stored as its own item in an array for scraping purposes.
I have a Read PDF With OCR activity. It reads my PDF (multiple pages) and stores the output in a String variable called strTrial.
Then I have arrTextFile, a string array of undefined size.
arrTextFile = strTrial.Split(something something something)
The "something" part is what I am stuck on. I don't know how to break the PDF text up by page. In theory it's very simple: each page gets converted to text, then added as an item in the array.
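To make the goal concrete, here is a minimal sketch in Python (not UiPath syntax) of the result I am after. It rests on an assumption I have not confirmed: that the OCR text contains a page delimiter, taken here to be the form-feed character. The names split_pages and PAGE_BREAK are made up for illustration only.

```python
# Assumption: pages are separated by a form feed ("\f").
# I do not know whether the OCR output actually contains this character.
PAGE_BREAK = "\f"


def split_pages(ocr_text, delimiter=PAGE_BREAK):
    """Return one flattened string per page (hypothetical helper)."""
    pages = []
    for raw_page in ocr_text.split(delimiter):
        # Collapse line breaks and repeated spaces so the page is one long string.
        flat = " ".join(raw_page.split())
        if flat:  # skip empty chunks, e.g. after a trailing delimiter
            pages.append(flat)
    return pages


# Made-up sample text standing in for strTrial.
sample = "Invoice 001\nTotal: 10\fInvoice 002\nTotal: 20\f"
arr_text_file = split_pages(sample)
print(arr_text_file)
```

If the output has no such delimiter, a split like this cannot work, and each page would probably need to be read separately instead.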
I hope I have explained my problem well and that someone can help.
Thank you in advance.