Regex for Arabic data extraction

Hi Team,
I am reading a PDF file that contains Arabic data and storing its text in the pdf_out variable. I need to extract a 10-digit ID number (written in Arabic digits) from pdf_out. How do I write a regex for this?
Your support would be greatly appreciated.
Thanks in advance.

Hi @amit.chaudhary!

Here is a suggestion you could try. This is the expression used:

System.Text.RegularExpressions.Regex.Matches( pdf_out,"\d{10}")(0).ToString

Does it get you the right number?
In my example the result is stored in a variable named digits, which is a String.
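
One thing to watch with the expression above: indexing (0) on the result of Regex.Matches throws an ArgumentOutOfRangeException when the text contains no match. Here is a minimal sketch of a safer variant using Regex.Match and its Success property. The sample value of pdf_out and the module name are made up for illustration. Note that in .NET, \d matches any Unicode decimal digit by default (unless RegexOptions.ECMAScript is set), so Arabic-Indic digits such as ١٢٣ are covered.

```vb
Imports System
Imports System.Text.RegularExpressions

Module FirstMatchDemo
    Sub Main()
        ' Sample value made up for illustration; in the workflow this comes from the PDF.
        Dim pdf_out As String = "ID: ١٠٢٣٤٥٦٧٨٩"

        ' Regex.Match returns an unsuccessful Match instead of throwing
        ' when nothing is found, unlike indexing Matches(...)(0).
        Dim m As Match = Regex.Match(pdf_out, "\d{10}")

        If m.Success Then
            Console.WriteLine(m.Value)
        Else
            Console.WriteLine("No 10-digit number found")
        End If
    End Sub
End Module
```

In an Assign activity, the same check could be written as a single If(...) expression around Regex.Match(pdf_out, "\d{10}").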


Hi @amit.chaudhary

Try this method, which matches 10 consecutive Arabic-Indic digits (U+0660 to U+0669) only:

System.Text.RegularExpressions.Regex.Matches( pdf_out,"[\u0660-\u0669]{10}")(0).ToString
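
A note on character ranges: the whole Arabic block (U+0600 to U+06FF) also contains letters and punctuation, so a pattern for an ID number is usually better restricted to the digit ranges. Below is a rough sketch, not a definitive implementation, that accepts ASCII, Arabic-Indic (U+0660 to U+0669) and Extended Arabic-Indic (U+06F0 to U+06F9) digits, and converts the match to ASCII digits. ExtractId is a hypothetical helper name and the sample text is made up.

```vb
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module ArabicIdDemo
    ' Hypothetical helper: returns the first run of exactly 10 digits
    ' as ASCII digits, or an empty string if there is none.
    Function ExtractId(text As String) As String
        Dim digit As String = "[0-9\u0660-\u0669\u06F0-\u06F9]"
        ' The lookarounds reject a 10-digit slice of a longer digit run.
        Dim pattern As String = "(?<!" & digit & ")" & digit & "{10}(?!" & digit & ")"

        Dim m As Match = Regex.Match(text, pattern)
        If Not m.Success Then Return String.Empty

        Dim sb As New StringBuilder()
        For Each c As Char In m.Value
            ' Char.GetNumericValue maps any Unicode decimal digit to 0..9.
            sb.Append(CInt(Char.GetNumericValue(c)).ToString())
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Sample text made up for illustration.
        Dim pdf_out As String = "رقم الهوية: ١٠٢٣٤٥٦٧٨٩"
        Console.WriteLine(ExtractId(pdf_out)) ' prints 1023456789
    End Sub
End Module
```

Converting to ASCII digits is optional; it mainly helps if the ID is later compared with values from Excel or a database that stores Western digits.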


Nived N :robot:

Happy Automation :relaxed::relaxed::relaxed::relaxed:


Hi @amit.chaudhary

Did it work out?
