The question is already answered, but here it is anyway:
Use the HTTP Request activity with your URL as the endpoint, and you’ll get the page source as output (String). You can then use a Regex to find the links, for example with
pattern = "(?<=\bhref\="")[^""]+?(?="")" (the doubled quotes are how VB escapes a literal quote inside a string). Note that this only catches href values wrapped in double quotes. A better approach would be to use an HTML parser (I’m a noob in VB and I don’t know any, but in Python you can use BeautifulSoup).
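To make the difference between the two approaches concrete, here is a rough Python sketch (not part of the attached workflow). It uses the same href pattern with Python's re module, and for the parser side it swaps in the standard library's html.parser instead of BeautifulSoup so it runs without installing anything. The sample HTML and the names links_with_regex, links_with_parser and LinkCollector are made up for illustration.

```python
import re
from html.parser import HTMLParser

# Same idea as the VB pattern above; Python raw strings need no doubled quotes.
HREF_PATTERN = re.compile(r'(?<=\bhref=")[^"]+?(?=")')


def links_with_regex(html):
    """Return every double-quoted href value, whatever tag it belongs to."""
    return HREF_PATTERN.findall(html)


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags only."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_with_parser(html):
    """Return the href of every <a> element, regardless of quoting style."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page source, standing in for the HTTP Request output.
sample = (
    '<p><a href="https://example.com/a">A</a> '
    "<a class=\"x\" href='/b'>B</a>"
    '<link href="style.css"></p>'
)

if __name__ == "__main__":
    print(links_with_regex(sample))
    print(links_with_parser(sample))
```

On this sample the regex misses the single-quoted link and also picks up the stylesheet href from the link tag, while the parser returns only the two A elements. That is the main reason a parser tends to be the safer choice.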
In the attached workflow, you’ll find one sequence with HTTP Request and Regex and another sequence with FindChildren.
In each case, you’ll find the result both as a single string with NewLine as the separator and as an Array of String. With FindChildren, I filter A elements only, but you can edit the selector. Each sequence’s variables are scoped to that sequence.
Scrap_Urls.xaml (13.3 KB)