Extract PDF without OCR

I am trying to extract data from multiple PDFs without using an OCR activity.
Please refer to the screenshot of the PDF and the workflow.
Please help me.
Thank you.


  1. If your PDF is digital (it has a selectable text layer), install the UiPath.PDF.Activities package and use the Read PDF Text activity.
  2. This returns the text of the PDF as a string.
  3. After that, you can use RegEx to extract specific data.
  4. If your PDF is a scanned document, then you have to use OCR.
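
The steps above can be sketched outside UiPath as a rough illustration. This is only a minimal Python sketch, not the workflow itself: `SAMPLE_TEXT` is a made-up stand-in for the string that Read PDF Text would return, and `needs_ocr` and `extract_field` are hypothetical helper names. It assumes each label and its value sit on the same line, which may not hold for every invoice layout.

```python
import re
from typing import Optional

# Hypothetical sample standing in for the output of Read PDF Text.
SAMPLE_TEXT = """Invoice To: ABC Traders
Voucher No: 15366
Dated: 12-Mar-2019
"""

def needs_ocr(text: str) -> bool:
    """A scanned PDF has no text layer, so plain extraction comes back blank."""
    return not text.strip()

def extract_field(text: str, label: str) -> Optional[str]:
    """Return the rest of the line after the label (and an optional colon)."""
    pattern = re.escape(label) + r"\s*:?\s*(.+)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None

if needs_ocr(SAMPLE_TEXT):
    print("No text layer found, OCR is required")
else:
    print(extract_field(SAMPLE_TEXT, "Voucher No"))
    print(extract_field(SAMPLE_TEXT, "Dated"))
```

The same pattern idea (escaped label, optional colon, capture the rest of the line) should carry over to a regex used inside the workflow, but test it against the real text from your PDF first.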

Hey @monikanimbalkar, did you get a solution?

No. I am trying the RegEx activity.

Yes.
We can use the normal Read PDF Text activity, store the output in a variable of type String, and then use either Regex or the Split method to get the string we want.
Cheers @monikanimbalkar
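
For comparison with the regex route, here is a rough sketch of the Split approach, written in Python only for illustration. `value_after` is a hypothetical helper and the sample text is invented; it assumes the value follows the label on the same line.

```python
def value_after(text: str, label: str) -> str:
    """Split once on the label, then keep the first line of what follows."""
    parts = text.split(label, 1)
    if len(parts) < 2:
        return ""  # label not present in the text
    lines = parts[1].splitlines()
    # Trim the separator (colon, spaces, tabs) left around the value.
    return lines[0].strip(" :\t") if lines else ""

# Hypothetical text standing in for the Read PDF Text output.
text = "Voucher No: 15366\nDated: 12-Mar-2019\n"
print(value_after(text, "Voucher No"))
print(value_after(text, "Dated"))
```

Split is easy to read when the labels are fixed, while a regex is usually the better fit when the value has a pattern of its own, such as a date or a number.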

Which fields do you want from the PDF? Tell me and I will help you. If possible, share the PDF.

  1. Invoice To:
  2. Despatch To
  3. Voucher No
  4. Dated
  5. Description of Goods
  6. Quantity

P.o.no-15366 for our Ambavadi SRA-02 Project…pdf (24.8 KB)

Hi @monikanimbalkar,

Run the attached file and refer to the output screenshot given below.

P.o.no-15366 for our Ambavadi SRA-02 Project…pdf (24.8 KB) PDF_Output.txt (1.5 KB) READING PDF.xaml (6.4 KB)

In the Activities panel, type PDF. If it is found, use the Read PDF Text activity; otherwise click **Search in available packages** and install the UiPath.PDF.Activities package. Please refer to the screenshot:

Thanks @Vivek.A.S

  • Can you explain a bit about the regex that you used in the workflow for Voucher No, Dated and Description of Goods?