Dynamic text extraction using OCR


Can anyone help me with the following scenario?

  1. I have a folder containing a set of PDF invoice files, e.g. 10 or more.

  2. I get the file count using the Directory methods, and then want to read each file using any OCR technique.

  3. The layout varies between PDFs: for example, the name may appear at the top of one PDF and at the end of another.

  4. So the position of the data is dynamic for each PDF file.

  5. How can I extract a particular field, such as the name, across all the PDFs?

Hi @monish06

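  1. Assign a String array: strArr = Directory.GetFiles(folderPath)
  2. Use a For Each activity to loop over each item in strArr.
  3. Pass item to the Read PDF With OCR activity, with Microsoft OCR as the engine.
  4. Use the Generate Data Table activity on the extracted text.
  5. Use a regex to locate the name field: the Is Match activity checks whether the pattern is present, and the Matches activity returns the matched text.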
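
To illustrate why a regex handles the varying position, here is a minimal sketch of the same loop in Python rather than as a UiPath workflow. It assumes the name is printed on its own line after a label such as "Name:" or "Name -", which may not match your invoices. The read_text parameter is a hypothetical stand-in for the OCR step, and the sample strings are invented for illustration.

```python
import re
from pathlib import Path
from typing import Callable, Dict, Optional

# Assumed label format: "Name: <value>" or "Name - <value>" on one line.
# Adjust the pattern to whatever label your invoices actually use.
NAME_PATTERN = re.compile(
    r"^\s*Name\s*[:\-]\s*(.+?)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_name(text: str) -> Optional[str]:
    """Return the value after the 'Name' label, wherever it appears in the text."""
    match = NAME_PATTERN.search(text)
    return match.group(1) if match else None


def extract_names(
    folder: str, read_text: Callable[[Path], str]
) -> Dict[str, Optional[str]]:
    """Apply extract_name to every PDF in the folder.

    read_text is a hypothetical callable that takes a PDF path and
    returns its text (the OCR step); it is not implemented here.
    """
    results: Dict[str, Optional[str]] = {}
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        results[pdf_path.name] = extract_name(read_text(pdf_path))
    return results


if __name__ == "__main__":
    # Invented sample text: the name is at the top in one and at the end in the other.
    top = "Name: Jane Doe\nInvoice No: 1001\nTotal: 250.00"
    bottom = "Invoice No: 1002\nTotal: 90.00\nName - John Smith"
    print(extract_name(top))     # Jane Doe
    print(extract_name(bottom))  # John Smith
```

Because the pattern is anchored to the label and not to a line number or page position, the same expression works whether the field is at the top or the end of the document. It can be reused in the regex step above, subject to the label assumption.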