Hi,
I’m using Google OCR with “Find OCR Text” to scan multi-page PDF documents. There can be more than 4,000 documents.
I can already loop through the documents while scanning. However, the page count varies: some documents have 4 pages, some have 8, and some have 16.
Question:
How can I recognize that I’ve reached the end of a single document when the number of pages is not fixed?
Would using sr (StreamReader) with ReadToEnd help here?
The match count from the expression below gives me the page count:
regex.Matches(sr…ReadToEnd(), "/Type\s*/Page[^s]")
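For reference, here is roughly the same page-count idea as a small Python sketch (Python is only for illustration; my workflow uses the .NET expression above). The function names `count_pdf_pages` and `count_pages_in_file` are made up for this example. It assumes the page objects are stored uncompressed in the file; a PDF that packs them into compressed object streams would give a wrong count with this approach.

```python
import re

# Same pattern as the .NET expression: "/Type /Page" NOT followed by "s",
# so the "/Type /Pages" node of the page tree is not counted.
PAGE_PATTERN = re.compile(rb"/Type\s*/Page[^s]")


def count_pdf_pages(pdf_bytes: bytes) -> int:
    """Count page objects in raw PDF bytes (assumes uncompressed objects)."""
    return len(PAGE_PATTERN.findall(pdf_bytes))


def count_pages_in_file(path: str) -> int:
    """Read the whole file as bytes and count its page objects."""
    with open(path, "rb") as f:
        return count_pdf_pages(f.read())
```

Reading the file as bytes avoids text-decoding errors, since PDFs contain binary data.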
My approach is to zoom in first (send hotkey Ctrl + “+”), because Google OCR can only read the text at that zoom level.
After that I send the Page Down hotkey to scan further.
In theory a 4-page document should need 8 page-downs, but in practice an 8-page document was found to need 19 page-downs, so the number of page-downs is not a fixed multiple of the page count.
How can I handle this and detect that the OCR has reached the last page of a document, so that I can move on to the next document?
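One idea I’m considering, written as a rough Python sketch rather than a tested solution: keep sending Page Down until the OCR text stops changing, and treat that as the end of the document. Everything here is hypothetical: `read_screen` stands in for the OCR step, `page_down` stands in for the hotkey, and `max_steps` is only a safety limit I picked arbitrarily.

```python
from typing import Callable, List


def scan_document(read_screen: Callable[[], str],
                  page_down: Callable[[], None],
                  max_steps: int = 200) -> List[str]:
    """Collect OCR text screen by screen until the view stops changing."""
    captures: List[str] = []
    previous = None
    for _ in range(max_steps):
        text = read_screen()       # hypothetical: OCR of the visible area
        if text == previous:       # same text twice: the view did not scroll
            break
        captures.append(text)
        previous = text
        page_down()                # hypothetical: send the Page Down hotkey
    return captures
```

I’m not sure how reliable this is. OCR noise could make the same screen read slightly differently on two passes, and two consecutive screens with identical content would end the loop too early, so a fuzzy comparison may be needed instead of an exact match.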