I am unable to read and extract data from a PDF file

Hi @NehaGhodki,

First of all, welcome to the UiPath community.

If you are using OCR for this use case, you may not get 100% accuracy. Results vary from document to document, and this is a limitation of OCR itself.

  1. File is not being read properly row by row, and after reading the PDF some letters are converted into special characters
  • You will get the entire file as a single string. Split it on System.Environment.NewLine, store the result in an array, and then read the array line by line. Because of document quality and OCR limitations, the output may still not be 100% accurate, so you will need to experiment with the scale setting and with different OCR engines, and make sure the PDF being read is of good quality.

  2. In a few fields, data is not coming sequentially in the text file after reading the PDF
  • The data may not come through in sequence, but there will usually be a pattern that you can identify and use to extract what you need. For instance, if you want the Invoice Number, and it appears on the second line followed by “Date”, first find the index of “Invoice Number” and then extract the text between “Invoice Number” and “Date”.
  3. So while extracting data from the text file there is a problem, and data is not extracted properly in that case.
  • Are you using the Read Text File activity, or is it via OCR?
    If Read Text File, please give us a sample file and we will test whether there are any issues.
    If via OCR, then the answer to point 1 applies here as well.
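
To make points 1 and 2 concrete, here is a minimal sketch of the two steps: splitting the text into lines, and pulling out the value that sits between two markers. It is written in Python purely for illustration; inside a UiPath workflow the same logic would be done with the .NET string methods Split, IndexOf and Substring. The function names and the sample text are made up for this example and are not taken from your file.

```python
from typing import List, Optional


def split_lines(text: str) -> List[str]:
    """Split the text into lines (any newline style) and drop empty ones."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def extract_between(text: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the text between start_marker and the next end_marker.

    Returns None when either marker is missing, which is worth handling
    because OCR output can misread a label.
    """
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = text.find(end_marker, start)
    if end == -1:
        return None
    # Trim separators and line breaks left around the value.
    return text[start:end].strip(" :\r\n\t")


# Hypothetical sample of what the extracted text might look like.
sample_text = "ACME Ltd\r\nInvoice Number: INV-1042\r\nDate: 2020-01-15\r\n"

lines = split_lines(sample_text)
invoice_number = extract_between(sample_text, "Invoice Number", "Date")
```

Checking for a missing marker before extracting is a deliberate choice here: with OCR, a label such as “Invoice Number” can come back with a wrong character, and it is easier to catch that case than to debug a wrong substring later.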

Happy Designing!

Regards,
V