Use Matches to obtain strings of text within a paragraph


I would like to pull all triathlon websites from the following using Matches:

href="/">RMS Technology

  • <div class="j10yRb" role="presentation"

    For example: "/"

    How would I do this? Should I use the advanced option?


Can you check whether one of the following options would be a better fit for the task:

  • using Find Children, filtering to all a elements (links), and retrieving the href attribute value
  • using XML processing, filtering to all a elements, and retrieving the href attribute value
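
To illustrate the idea behind both options (walk the markup, keep only the a elements, read their href attribute), here is a minimal sketch in Python using only the standard library. It is not UiPath code; the names `HrefCollector` and `collect_hrefs` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> element it encounters."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lower-cased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def collect_hrefs(html):
    """Return all href values of <a> elements found in the given markup."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


# Hypothetical sample markup, not taken from the page in the question.
sample = (
    '<div class="j10yRb" role="presentation">'
    '<a href="https://example.com/">Example</a> '
    '<a href="/about">About</a></div>'
)
print(collect_hrefs(sample))
```

Because the parser understands the element structure, no pattern has to decide where an attribute value ends.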

About regex: a quick and dirty approach could be:

(?<=href=)".*" seems to start in the right place, but then it highlights all the text after it, even text that is not needed. Is there a better way to indicate where the match should stop?
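
The over-matching described here is typical of a greedy `.*`, which runs to the last quote on the line. Two common ways to stop earlier are a lazy quantifier (`.*?`) or a negated character class (`[^"]*`). The sketch below shows the difference in Python; the sample text is hypothetical, and the same three patterns should behave the same way in the .NET regex engine.

```python
import re

# Hypothetical sample text, not the page from the question.
text = '<a href="https://example.com/">Example</a> <span class="x">text</span>'

# Greedy: .* extends to the LAST double quote on the line.
greedy = re.findall(r'(?<=href=)".*"', text)

# Lazy: .*? stops at the FIRST closing double quote.
lazy = re.findall(r'(?<=href=)".*?"', text)

# Negated class: [^"]* cannot run past a double quote at all.
negated = re.findall(r'(?<=href=)"[^"]*"', text)

print(greedy)
print(lazy)
print(negated)
```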

Yes, since it is the simplest possible regex pattern, it also captures the surrounding "". I don't know your regex skills, so I aimed to keep it as simple as possible.

However, that is not a problem, as the quotes can easily be removed:

  • Use the Matches activity and configure Pattern, Input, and Output
  • Afterwards, run the following within an Assign:
    left side: Urls (a String() variable)
    right side: Matches.Select(Function(m) m.ToString.Replace(Chr(34).ToString, "").Trim).ToArray

and you will get a string array with all URLs.
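
For readers who want to try the same clean-up outside UiPath, here is a rough Python equivalent of the Assign above: take every match, remove the double quotes, and trim. The sample text and the lazy variant of the pattern are assumptions made for this sketch.

```python
import re

# Hypothetical sample text, not the page from the question.
text = '<a href="https://example.com/">Example</a> <a href="/about">About</a>'

# Each match still includes the surrounding double quotes.
matches = re.finditer(r'(?<=href=)".*?"', text)

# Mirror of Replace(Chr(34).ToString, "").Trim applied to every match.
urls = [m.group(0).replace('"', "").strip() for m in matches]

print(urls)
```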

Another approach would be to work with regex groups and refer to the URL surrounded by the quotes (").

Try this expression:
From every match returned, you will want to take group 2.
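
As a sketch of the group-based approach, the Python snippet below uses a simplified pattern, `href=(["'])(.*?)\1`, which is an assumption and not necessarily the exact expression suggested here. Group 1 captures the opening quote character, `\1` requires the same character to close the value, and group 2 holds the URL without quotes.

```python
import re

# Group 1: the opening quote (single or double).
# Group 2: the attribute value, matched lazily up to the same quote (\1).
pattern = re.compile(r"""href=(["'])(.*?)\1""")

# Hypothetical sample text mixing both quote styles.
text = """<a href="https://example.com/">A</a> <a href='/about'>B</a>"""

urls = [m.group(2) for m in pattern.finditer(text)]

print(urls)
```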

How can I adapt <?href=(["'])(.*?)\1 to ensure it only picks up links containing /? At the moment it is also picking up links such as:


Maybe this helps:

Refer to group 1.
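
One possible way to keep only values that contain a /, sketched in Python: require a literal / somewhere between the quotes and exclude quote characters from the value. This pattern is an illustration written for this sketch, not the expression from the reply above; here the URL happens to land in group 1. The sample text is hypothetical.

```python
import re

# [^"']* cannot cross a quote, so the "/" must sit inside the same value.
pattern = re.compile(r"""href=["']([^"']*/[^"']*)["']""")

# Hypothetical sample text: the "#top" link has no slash and is skipped.
text = (
    '<a href="https://example.com/">A</a> '
    '<a href="#top">B</a> '
    '<a href="/about">C</a>'
)

urls = [m.group(1) for m in pattern.finditer(text)]

print(urls)
```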

If it does not work as expected, please share clearly described requirements and sample values with us.