How to get specific text from a scanned PDF document


I want to extract specific text from a document that was scanned into PDF format.

Let's say we scan our Aadhaar card and save it as a PDF. How can we get our name from that PDF?


Hi @raju_alakuntla

1. Use the Read PDF With OCR activity.
2. Use a regex on the resulting text to get the value you need.

Hi @raju_alakuntla

=> Use the Read PDF Text activity if the document has proper structure (that is, real selectable text rather than a scanned image).
=> If you have scanned PDFs, it is better to use the Read PDF With OCR activity. Both activities return a String output, which you can store in a variable.
=> After this, use regular expressions to extract the required output from the whole text.
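
As a rough illustration of the choice between the two activities above, here is a minimal Python sketch (not UiPath code). The functions `read_text_layer` and `read_with_ocr` are hypothetical stand-ins for the Read PDF Text and Read PDF With OCR activities, and the `min_chars` threshold is an assumption. It tries the text layer first and falls back to OCR when almost nothing comes back, which is what usually happens with a scanned page.

```python
from typing import Callable


def read_pdf_as_string(
    path: str,
    read_text_layer: Callable[[str], str],
    read_with_ocr: Callable[[str], str],
    min_chars: int = 20,  # assumed threshold, tune for your documents
) -> str:
    """Return the PDF content as one string, using OCR only when needed."""
    text = read_text_layer(path) or ""
    if len(text.strip()) >= min_chars:
        return text
    # Little or no text layer: treat the file as a scanned document.
    return read_with_ocr(path)


# Stand-in readers, used only to demonstrate the control flow.
def fake_text_layer(path: str) -> str:
    return ""  # a scanned PDF usually has no text layer


def fake_ocr(path: str) -> str:
    return "Name: Ravi Kumar\nDOB: 01/01/1990"  # made-up sample text


result = read_pdf_as_string("scan.pdf", fake_text_layer, fake_ocr)
print(result)
```

In a workflow the same idea would be an If condition on the String variable returned by the first activity.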

Hope it helps!!

Can you please provide the regex?



Hi, try the Read PDF With OCR activity, then use a regular expression to extract the text.


Okay @raju_alakuntla

Provide the input data and the output fields you require. Then I will write the regex for you.

Hope you understand!!

@raju_alakuntla The first step is to read the PDF using the PDF activities, such as Read PDF Text.
These activities become available once you install the package.

In Manage Packages, install UiPath.PDF.Activities, then use the Read PDF Text activity.

If you have scanned documents, use the Read PDF With OCR activity.

Once this is done, send the output text so that we can provide the required regex for it.
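
Until the real output text is available, here is a rough Python sketch of what such a regex might look like. It assumes the OCR text contains a literal "Name" label followed by the name on the same line; the sample text, the name in it, and the `extract_name` helper are all made up for illustration, and a real card may have no such label, in which case the pattern must be rewritten against the actual output.

```python
import re

# Hypothetical OCR output; a real scan will differ and may contain OCR noise.
ocr_text = """Government of India
Name: Ravi Kumar
DOB: 01/01/1990
"""

# Assumption: the name sits on the same line as a "Name" label,
# optionally separated by ":" or "-".
NAME_PATTERN = re.compile(
    r"^[ \t]*Name\b[ \t]*[:\-]?[ \t]*(?P<name>[A-Za-z][A-Za-z .']*?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_name(text):
    """Return the labelled name, or None when the label is not found."""
    match = NAME_PATTERN.search(text)
    return match.group("name") if match else None


print(extract_name(ocr_text))
```

If you adapt the pattern for a .NET regex in UiPath, note that the named group is written `(?<name>...)` there instead of Python's `(?P<name>...)`.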
