Regex for scanned PDF

How do I create a regex for a name and address in a scanned PDF?
How do I capture particular details in a scanned PDF?



Are you able to read the data using Read PDF or OCR?

If yes, please show the data; otherwise it is difficult to suggest a regex.



For names, a basic regex is ([A-Z][a-z]+ [A-Z][a-z]+). This matches a typical first name and last name, but you may need to adjust it for your specific use case and the data in your PDFs.
Addresses are more complex, since their format varies a lot. A simple regex for an American address might look like: (\d+ [A-Za-z0-9]+ St(?:reet)?, [A-Z][a-z]+, [A-Z]{2} \d{5}). Note that it only handles single-word street names ending in "St" or "Street" and single-word city names.
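
To see how these two patterns behave, here is a minimal sketch in Python. The sample text is made up for illustration, and the outer capturing parentheses are dropped so that `findall` returns whole matches. The pattern syntax used here is common to most regex engines, but check it against the engine your workflow actually uses.

```python
import re

# Patterns from the post above, without the outer capturing group.
NAME_RE = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")
ADDRESS_RE = re.compile(
    r"\d+ [A-Za-z0-9]+ St(?:reet)?, [A-Z][a-z]+, [A-Z]{2} \d{5}"
)

# Hypothetical OCR output; real scanned text will look different.
sample = (
    "Customer: John Smith\n"
    "Address: 123 Main Street, Springfield, IL 62704\n"
)

names = NAME_RE.findall(sample)
addresses = ADDRESS_RE.findall(sample)

print(names)      # ['John Smith', 'Main Street']
print(addresses)  # ['123 Main Street, Springfield, IL 62704']
```

The name pattern also picks up "Main Street", because it matches any two adjacent capitalized words. In practice it usually needs an anchor, such as a label that precedes the name in the text.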

Hi @satyarohith2020

Read the scanned PDF using the Read PDF with OCR activity and write the result to a text file.
After that, you can write regex expressions against that text.
If possible, send the text file and the expected output so that we can help you with the regex.

Hope it helps!!

Hey Anil, thank you for the response. I can read the data into a text file using PDF text.

Example: Satya Rohith Nallam
Note: The strings may vary based on names.
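
For a name like the example above, which has three words rather than two, a two-word pattern stops early. A minimal sketch, assuming names are two or more capitalized words separated by single spaces (an assumption, since the real text layout is not shown):

```python
import re

# Exactly two capitalized words.
TWO_WORD_RE = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")

# Two or more capitalized words: the non-capturing group repeats
# " Word" one or more times after the first word.
MULTI_WORD_RE = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)+")

text = "Satya Rohith Nallam"

two_word = TWO_WORD_RE.search(text).group(0)
multi_word = MULTI_WORD_RE.search(text).group(0)

print(two_word)    # Satya Rohith
print(multi_word)  # Satya Rohith Nallam
```

This still will not handle initials, hyphens, all-caps names, or OCR misreads, so it should be tested against real samples.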

Thank you, Neha. However, I am new to UiPath and do not understand much of this yet, but I will try it against the task. Thanks a lot.

Hey Parvathy, thanks very much. However, I cannot share the files because they contain confidential information.


The ask was to share a sample of the text you read. You can replace the actual names and just show the structure; then a regex can be provided. Also let us know which part of the provided text contains the details you need.


Hi @satyarohith2020

If you send sample text that is not confidential but is in the same format, we can help you with regular expressions or data manipulation.

Hope it helps!!
