How to extract data using REGEX

Hi All,

I have multiple scanned PDF files and I want to extract some data from them: customer account, date bill prepared, amount due, and due by.

The challenge is that in some PDFs the account number is labeled "Customer Account" and in others it is labeled "Account #", and so on.

The date bill prepared, amount due, and due by fields are also labeled differently from file to file.

How can I achieve this?

Happy automation.

Hey!

Could you please send us some sample data with the expected output?

I will give you the regex.

Regards,
NaNi

Hi @THIRU_NANI ,

Sorry, I can't share the files with anyone for security reasons.

But I can share a few screenshots with you.
[four screenshots]

And so on. Also, the data is not in the same place in every file.

Hey!

Just give us a sample input. You can modify the numbers.

We only want to know the pattern so we can get the result.

Once you read the PDF, the text is stored in a string variable, right?

Write it to a text file, change the data, and paste the output string here.

I will give you the regex.

Regards,
NaNi

Hey!

Customer Number:

System.Text.RegularExpressions.Regex.Match(strInputVariable,"(?<=Customer Number\n)\d+").ToString

The above expression will give you the Customer Number, assuming the number sits on the line directly below the "Customer Number" label.

System.Text.RegularExpressions.Regex.Match(strInputVariable,"(?<=Account#\s)\d+").ToString

The above expression will give you the Account Number
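
Since the label differs from file to file, one option is to list the label variants in a single pattern and capture the digits in a group. The following is only a sketch: the sample text and the exact label spellings ("Customer Account", "Customer Number", "Account #", "Account No.") are assumptions, so adjust them to whatever your OCR output really contains.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LabelVariantsSketch
    Sub Main()
        ' Hypothetical sample text; real OCR output will differ.
        Dim strInputVariable As String =
            "Customer Account: 12345" & Environment.NewLine & "Amount Due: $250.75"

        ' Alternation covers the assumed label variants; group 1 holds the digits.
        ' \s* also matches a line break, so the number may sit on the next line.
        Dim pattern As String =
            "(?:Customer\s+(?:Account|Number)|Account\s*(?:#|No\.?))\s*:?\s*(\d+)"

        Dim m As Match = Regex.Match(strInputVariable, pattern, RegexOptions.IgnoreCase)
        If m.Success Then
            Console.WriteLine(m.Groups(1).Value) ' prints 12345
        Else
            Console.WriteLine("No account number found")
        End If
    End Sub
End Module
```

In UiPath the same idea fits in one Assign expression: System.Text.RegularExpressions.Regex.Match(strInputVariable, pattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase).Groups(1).Value. The other fields (date bill prepared, amount due, due by) can be handled the same way, with one pattern per field listing that field's label variants.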

Regards,
NaNi

Hello @Rakesh_Tiwari

You can use the regex builder in UiPath to create the regex expressions.

Then you can use the Matches activity or a regex expression such as System.Text.RegularExpressions.Regex.Match(strInputVariable,"(regexexpression)").ToString

Hi,

I am getting the output below.
image

I used the regex below to get the account number as output.

Desired output: 123…

Hey!

If you're using Matches, it returns a collection, so pick an item from it (here the first one). It should be like this:

System.Text.RegularExpressions.Regex.Matches(strInputVariable,"(?<=Account#\s)\d+")(0).Value

If you're using Match, which returns only the first match, it should be like this:

System.Text.RegularExpressions.Regex.Match(strInputVariable,"(?<=Account#\s)\d+").ToString

Regards,
NaNi

Try assigning that to a variable and printing the result in a message box. I think you are getting a collection rather than a single string, and that's why it's showing like that.
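
To illustrate the difference, here is a small sketch with a made-up input string that contains two account numbers. Match gives back a single result, while Matches gives back a MatchCollection whose items each have their own Value.

```vb
Imports System
Imports System.Text.RegularExpressions

Module MatchVersusMatchesSketch
    Sub Main()
        ' Hypothetical input with two account numbers.
        Dim strInputVariable As String = "Account# 123 and Account# 456"
        Dim pattern As String = "(?<=Account#\s)\d+"

        ' Match returns only the first hit.
        Dim firstValue As String = Regex.Match(strInputVariable, pattern).Value
        Console.WriteLine(firstValue) ' prints 123

        ' Matches returns every hit as a collection; read each item's Value.
        Dim results As MatchCollection = Regex.Matches(strInputVariable, pattern)
        For Each m As Match In results
            Console.WriteLine(m.Value) ' prints 123, then 456
        Next
    End Sub
End Module
```

If you print the collection itself instead of an item's Value, you will see the type name rather than the number, which may be what the screenshot shows.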