Select 1st or 2nd match while using regex based extractor and problem in regex expression for address

Hello everyone,
I am working on a regex-based extractor to extract data from a scanned document, and I have a couple of questions about regex.

First, I need to extract the address “ABC Strasse 27”, but my regex doesn’t return that value; it selects “limited” instead. When I try to add “Limited” to the regex, the output is blank. How can I solve this issue?

Second, how can I make sure that my regex takes only the 1st or the 2nd match? For example, for the email I need the 2nd match. How can I select it?

Regards,
Ekram

Hi @emshihab

You can use Regex to extract the results.

To use Regex, you need 3 things:

  • Sample/s
  • Expected Output
  • Pattern/Information on what is consistent in the text.

As for Q1, we need a strong pattern, because addresses can vary in length and in number of lines.

I have made a quick regex pattern for Q1. Please note that this pattern is not robust: it simply ignores the first line and starts matching from the second line.
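For readers without the attachment, here is a minimal sketch of the "skip the first line, match the second" idea in VB.NET. It is not the pattern from the workflow. The sample text (including the "12345 Berlin" line), the group name `street`, and the assumption that the street always sits on the second line are all hypothetical.

```vb
Imports System
Imports System.Text.RegularExpressions

Module AddressSketch
    Sub Main()
        ' Hypothetical OCR text: company name on line 1, street on line 2.
        Dim inputText As String = String.Join(Environment.NewLine,
            {"ABC Limited", "ABC Strasse 27", "12345 Berlin"})

        ' \A anchors at the start of the text, [^\r\n]*\r?\n consumes the
        ' whole first line, and the named group captures the second line.
        Dim pattern As String = "\A[^\r\n]*\r?\n(?<street>[^\r\n]+)"

        Dim m As Match = Regex.Match(inputText, pattern)
        If m.Success Then
            Console.WriteLine(m.Groups("street").Value) ' prints: ABC Strasse 27
        Else
            Console.WriteLine("No address line found")
        End If
    End Sub
End Module
```

In an Assign activity the same idea would probably reduce to `Regex.Match(INPUT_STRING, pattern).Groups("street").Value`. Like the original pattern, it breaks as soon as the address moves to another line.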

Q.2

I have made a regex pattern for Q2. This will match the email addresses in the body of text. Then we just need to collect the second result as you mentioned.

To do this, insert an Assign Activity.
Left of Assign:

YOURSTRING_VARIABLE

Right of Assign:

System.Text.RegularExpressions.Regex.Matches(INPUT_STRING, "REGEXPATTERN")(1).ToString

Note the index in parentheses at the end of the expression. It is zero-based:
(0) = 1st result
(1) = 2nd result
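One caveat: indexing with (1) throws an out-of-range error when the text contains fewer than two matches. A minimal VB.NET sketch that checks the match count first could look like this. The input text and the deliberately simple email pattern are assumptions for illustration, not the pattern from the attached workflow.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SecondMatchSketch
    Sub Main()
        ' Hypothetical input containing two email addresses.
        Dim inputText As String = "Sales: sales@example.com, Support: support@example.com"

        ' Deliberately simple email pattern, for illustration only.
        Dim pattern As String = "[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

        Dim found As MatchCollection = Regex.Matches(inputText, pattern)

        ' Index 1 is the 2nd match; check Count first so a document with
        ' only one email yields an empty string instead of an exception.
        Dim secondEmail As String = If(found.Count > 1, found(1).Value, String.Empty)
        Console.WriteLine(secondEmail) ' prints: support@example.com
    End Sub
End Module
```

In a workflow, the same check can go in an If activity (condition `found.Count > 1`) placed before the Assign.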

I have made a quick workflow to assist you.

Main.xaml (7.1 KB)

Want to learn more? Take a look here:

Hopefully this helps you :blush:

Cheers

Steve

@Steven_McKeering thank you for your suggestion. I will try it and let you know.
