Hi,
I would like to pull all triathlon websites from the following using matches:
href="/triathlon.com/rmstechnology/home">RMS Technology
Can you check whether the following options would be a better fit for the task?
About Regex. A quick and dirty approach could be:
(?<=href=)".*" seems to start in the right place, but then it highlights all the text after it, even text that is not needed. Is there a better way to indicate where the match should stop?
@Katie_Vooght
Since that was the simplest possible regex pattern, yes, it also captures the surrounding quotes (""). I didn't know your RegEx skill level and aimed to keep it as simple as possible.
However, that is not a problem, as the quotes can easily be removed,
and you will get a string array with all URLs.
Another approach would be to work with regex groups and refer to the URL surrounded by "
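To illustrate the two suggestions above (trimming the quotes afterwards, or using a group), here is a minimal sketch in Python's `re` module. It assumes the same regex syntax applies in your tool; the sample `html` string and the second link in it are made up for illustration. It also shows one way to tell the pattern where to stop: `[^"]*` cannot run past the closing quote, whereas `.*` continues to the last quote on the line.

```python
import re

# Hypothetical sample input; only the first link comes from the thread.
html = ('<a href="/triathlon.com/rmstechnology/home">RMS Technology</a> '
        '<a href="/other/page">Other</a>')

# Greedy: .* runs to the LAST quote on the line, giving one oversized match.
greedy = re.findall(r'(?<=href=)".*"', html)

# Bounded: [^"]* cannot cross a quote, so each match ends at its closing quote.
quoted = re.findall(r'(?<=href=)"[^"]*"', html)

# Option 1: remove the surrounding quotes after matching.
urls_trimmed = [m.strip('"') for m in quoted]

# Option 2: capture only the part between the quotes with a group.
urls_grouped = re.findall(r'href="([^"]*)"', html)

print(urls_trimmed)
print(urls_grouped)
```

Both options should give the same list of URLs without the quotes.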
Try this expression:
<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1
From every match returned, you will want to get group 2 (group 1 only holds the opening quote character):
Match.Groups(2).Value
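As a rough check of how the groups in that expression line up, here is a small Python sketch. The sample `html` and the second link are hypothetical, and Python's `m.group(2)` stands in for the group lookup in your own tool. Group 1 captures the quote character (so `\1` can require the same closing quote), and group 2 captures the URL.

```python
import re

# Same expression as above: group 1 = quote character, group 2 = URL.
pattern = re.compile(r'''<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1''')

# Hypothetical sample input mixing double- and single-quoted attributes.
html = ('<a href="/triathlon.com/rmstechnology/home">RMS Technology</a>'
        "<a class='nav' href='/other/page'>Other</a>")

# Collect group 2 from every match.
urls = [m.group(2) for m in pattern.finditer(html)]
print(urls)
```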
How can I adapt <?href=(["'])(.*?)\1 to ensure it only picks up links containing /triathlon.com/rmstechnology? At the moment it is also picking up links such as:
@Katie_Vooght
maybe this helps:
(?<=href=")(\/triathlon\.com\/rmstechnology\/.*)(")
If it does not work as expected, please share your clearly described requirements and sample values with us.
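For reference, a possible variation on that pattern, sketched in Python: it swaps `.*` for `[^"]*` so the match stops at the closing quote even when several links sit on one line, and it drops the trailing `(")` group since the quote is then no longer needed. The sample `html` and its extra links are assumptions for illustration, not taken from your page.

```python
import re

# Lookbehind anchors on href=", then only the rmstechnology path is accepted.
# [^"]* stops at the closing quote instead of running to the end of the line.
pattern = re.compile(r'(?<=href=")/triathlon\.com/rmstechnology/[^"]*')

# Hypothetical sample input; only the first link comes from the thread.
html = ('<a href="/triathlon.com/rmstechnology/home">RMS Technology</a> '
        '<a href="/triathlon.com/other/home">Other</a> '
        '<a href="/triathlon.com/rmstechnology/results">Results</a>')

# The pattern has no capture groups, so findall returns the full matches.
urls = pattern.findall(html)
print(urls)
```

The link under `/triathlon.com/other/` should be skipped, because the literal path in the pattern does not match it.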