Extract URL from email body using Regex command

I am processing emails and using an Assign activity with the following Regex command to get the URL from inside the email.

System.Text.RegularExpressions.Regex.Match(EmailBodyText,"(?<=<).*(?=>)").Value

The issue I have is that an email may contain multiple URLs and I need to get all the URLs and then download the files via an HTTP request.

How do I store all the URLs found in EmailBodyText and then download each one to a specific folder, keeping the original file name (which isn’t found in the URL)?

Hi @craig.norton,

Could you share a sample email body containing the URLs you are trying to extract? Thanks.

Kind regards,
Kenneth

To add: if you need multiple results, use Regex.Matches instead of Regex.Match, which returns only the first match.
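To illustrate the difference between a single-match call and an all-matches call, here is a minimal sketch in Python rather than in the VB.NET expression used in the Assign activity. The email body and the example.com URLs are made up, since the real content was not shared; `re.search` plays the role of Regex.Match and `re.findall` the role of Regex.Matches.

```python
import re

# Hypothetical email body; the real one was not shared in the thread.
email_body = (
    "Report one: <https://example.com/download?id=1>\n"
    "Report two: <https://example.com/download?id=2>\n"
)

# Same pattern as in the question: everything between "<" and ">".
pattern = r"(?<=<).*(?=>)"

# Single-match call: stops after the first hit (comparable to Regex.Match).
first = re.search(pattern, email_body)
print(first.group())

# All-matches call: collects every hit (comparable to Regex.Matches).
all_urls = re.findall(pattern, email_body)
print(all_urls)
```

The pattern itself does not change; only the call that applies it does.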

Thanks for the response. Should I also change the variable type from String to Array of Strings?

Hi Kenneth. Unfortunately, I can’t share the content as it contains sensitive information.

Craig

Yes, so that the expression will return all the matched outputs.

@craig.norton

The output of Regex.Matches is a MatchCollection, i.e. an IEnumerable of Match objects, not an array of strings.

You just need to iterate through it to access every match; each item is of type Match.
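As a rough sketch of that iteration, again in Python with a made-up email body: `re.finditer` yields one match object per hit, and `.group()` corresponds to reading `.Value` on each Match. The sketch also shows a caveat with the pattern from the question. Because `.*` is greedy, two URLs on the same line are returned as one long match. The tighter pattern below (`https?://[^<>\s]+`) is only an assumption about what the links look like, not something confirmed in the thread.

```python
import re

# Hypothetical body with two URLs on the SAME line.
email_body = "Files: <https://example.com/a?id=1> and <https://example.com/b?id=2>"

# Pattern from the question: greedy, runs from the first "<" to the last ">".
greedy = r"(?<=<).*(?=>)"

# Assumed alternative: stop at the first "<", ">" or whitespace character.
tight = r"(?<=<)https?://[^<>\s]+(?=>)"

# Iterate over the match objects and read the matched text of each one.
greedy_hits = [m.group() for m in re.finditer(greedy, email_body)]
tight_hits = [m.group() for m in re.finditer(tight, email_body)]

print(len(greedy_hits))  # one over-long match
print(tight_hits)        # one entry per URL
```

Greedy quantifiers work the same way in the .NET regex engine, so the same adjustment to the pattern should carry over to Regex.Matches.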

Cheers

Hi @craig.norton,

I have attached a sample code that uses the Matches activity.

Hope this helps.

Kind regards,
Kenneth

P.S. The output of the Matches activity needs a variable of type System.Collections.Generic.IEnumerable<System.Text.RegularExpressions.Match>.

RegExMatches.xaml (6.9 KB)

Hi @kennbalobalo

I couldn’t open your file and got a lot of warning messages. Are there any particular Dependencies I need to add?

I updated my variable to the type you mentioned but no luck. This is my Assign statement.

System.Text.RegularExpressions.Regex.Matches(EmailBodyText,"(?<=<).*(?=>)").Value

And this is the error message

[Screenshot of the error message]

@craig.norton

System.Text.RegularExpressions.Regex.Matches(EmailBodyText,"(?<=<).*(?=>)")(0).Value gives the first matched value. Similarly, if you change 0 to 1 you will get the second, and so on.

To access all of them, use a For Each activity and change its TypeArgument to System.Text.RegularExpressions.Match.

Then inside the loop you can use currentItem.Value.
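For the download step inside the loop, one possible approach to the "original file name" part of the question is sketched below in Python. It assumes the server sends the name in a `Content-Disposition` response header, which is a common convention but is not confirmed anywhere in this thread. The helper names `filename_from_headers` and `download`, and the `download.bin` fallback, are hypothetical.

```python
import os
import urllib.request
from email.message import Message
from urllib.parse import unquote, urlsplit


def filename_from_headers(content_disposition, url):
    """Pick a file name from a Content-Disposition value, else from the URL path."""
    if content_disposition:
        # Reuse the standard library's header parser to read the filename parameter.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the last path component so the file stays in the target folder.
            return os.path.basename(name)
    # Fallback (assumption): last segment of the URL path, or a fixed default.
    path_name = os.path.basename(unquote(urlsplit(url).path))
    return path_name or "download.bin"


def download(url, folder):
    """Download url into folder and return the path of the saved file."""
    with urllib.request.urlopen(url) as response:
        name = filename_from_headers(response.headers.get("Content-Disposition"), url)
        target = os.path.join(folder, name)
        with open(target, "wb") as fh:
            fh.write(response.read())
    return target
```

Calling `download(url, folder)` once per extracted URL would complete the loop. If the server does not send that header, the name has to come from somewhere else, such as the email itself.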

This should solve the issue

Cheers

Hi @craig.norton

I used a different activity which is the Matches activity. I did not use the Assign activity.

Here’s what’s inside the Matches activity:

Also, make sure you have System.Text.RegularExpressions in the Imports panel.

Hope this helps.

Kind regards,
Kenneth
