Extract certain text from a string

I have the following string:

href="/pwc.com/rmstechnology/innovators/markets"
href="/pwc.com/rmstechnology/blogs"
href="/pwc.com/rmstechnology/blogs/archive"
href="/pwc.com/rmstechnology/got-an-idea"
href="/pwc.com/rmstechnology/home"
href="/pwc.com/rmstechnology/home"
href="/pwc.com/rmstechnology/about"
href="/pwc.com/rmstechnology/learn"

How can I reduce this to just a string of the last word of each line? I.e.

markets
blogs
archive
got-an-idea
home
home
about
learn

Also, how can I avoid duplicates in the resulting string?

Hi @Katie_Vooght

Use Strings.Replace(MyString, "href=""/pwc.com/rmstechnology/", vbNullString)

Assign it to a string variable
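
Note that the Replace above leaves multi-part paths such as innovators/markets intact, so one more step is needed to keep only the last word. A minimal sketch of that extra step, written in Python purely to illustrate the logic (the thread itself uses VB expressions in UiPath, and the function name last_segments is made up for this example):

```python
# Sample input copied from the question: one href attribute per line.
text = '''href="/pwc.com/rmstechnology/innovators/markets"
href="/pwc.com/rmstechnology/blogs"
href="/pwc.com/rmstechnology/blogs/archive"
href="/pwc.com/rmstechnology/got-an-idea"
href="/pwc.com/rmstechnology/home"
href="/pwc.com/rmstechnology/home"
href="/pwc.com/rmstechnology/about"
href="/pwc.com/rmstechnology/learn"'''


def last_segments(raw):
    """Return the final path segment of every non-empty line (hypothetical helper)."""
    result = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        # Drop the closing quote, then keep whatever follows the last slash.
        result.append(line.rstrip('"').rsplit("/", 1)[-1])
    return result


print("\n".join(last_segments(text)))
```

In a workflow, the same idea would amount to splitting each line on "/" and keeping the final element.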

Like this? I am getting a message saying errors expected.

[image]

@Katie_Vooght,

There are two double quotes (") in the search string, and that is what is causing the error. Remove the href=" part and it should work.
Please let me know

Regards

Thanks Will.

The output is:

<a class="aJHbb hDrhEe HlqNPb" jsname="QwLHlb" role="link" tabindex="-1" innovators/markets"
<a class="aJHbb dk90Ob hDrhEe HlqNPb" jsname="QwLHlb" role="link" tabindex="-1" aria-expanded="false" aria-haspopup="true" blogs"
<a class="aJHbb hDrhEe HlqNPb" jsname="QwLHlb" role="link" tabindex="-1" blogs/archive"
<a class="aJHbb dk90Ob hDrhEe HlqNPb" jsname="QwLHlb" role="link" tabindex="-1" got-an-idea"
<a class="GAuSPc" jsname="jIujaf" home"
<a class="aJHbb dk90Ob jgXgSe HlqNPb" jsname="QwLHlb" role="link" tabindex="0" home"
<a class="aJHbb dk90Ob jgXgSe HlqNPb" jsname="QwLHlb" role="link" tabindex="-1" about"

Not sure why the beginning bit has appeared?

Hi @Katie_Vooght,

Could you share your entire workflow project so I can take a look?

Thanks!

Sort_String.xaml (12.5 KB)

The text also includes links which are not of the form 'pwc.com/rmstechnology/learn'. I would like to remove these and also remove any duplicates.

Might that become too complicated?

Ah, the extra text is appearing because I changed the value in Matches for the advanced search.

Could you please help me with duplicates?
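
For the duplicates, one common approach is to keep only the first occurrence of each word while preserving the original order. A small Python sketch of the idea, assuming the last words have already been extracted into a list (the variable names are illustrative only, and Python is used just to show the logic):

```python
# Last words as listed in the question, including the repeated "home".
words = ["markets", "blogs", "archive", "got-an-idea", "home", "home", "about", "learn"]

# dict keys are unique and keep insertion order, so this drops later repeats.
unique_words = list(dict.fromkeys(words))

print("\n".join(unique_words))
```

The result keeps a single "home" entry and leaves the other items in their original order.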

Hi @Katie_Vooght,

Are you trying to extract all hyperlinks from a webpage?

I will prepare something for you

Yes, all web links, using Matches. Thanks.

Hi @Katie_Vooght,

If so, you can use the Regex: URL, then loop through the variable. Please have a look at my workflow.

Regards

Extract_certain_text_from_a_string_Test.zip (27.6 KB)

Please let me know if you still have any questions.

Thanks Will, I have already done this; however, it does not pick up any of the links with href="/pwc.com/rmstechnology/

@Katie_Vooght,

You can try searching for this text using the Regex builder, then replace the text you don't need, as I mentioned in a previous post. It is very similar to what I did.

Regards
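
The search-then-replace idea could look roughly like the following. This is only a sketch in Python to show the two steps; the sample markup, the PREFIX constant, and the extract_paths helper are assumptions made for illustration and are not taken from the attached workflow:

```python
import re

# The part of each link that should be matched and then stripped.
PREFIX = 'href="/pwc.com/rmstechnology/'

# Hypothetical page fragment: two wanted links and one unrelated link.
html = (
    '<a class="aJHbb" role="link" href="/pwc.com/rmstechnology/innovators/markets">\n'
    '<a class="other" href="https://example.com/page">\n'
    '<a class="aJHbb" role="link" href="/pwc.com/rmstechnology/blogs">\n'
)


def extract_paths(raw):
    """Find hrefs that start with PREFIX, then strip the prefix and closing quote."""
    # Step 1: search for the prefix followed by everything up to the closing quote.
    matches = re.findall(re.escape(PREFIX) + r'[^"]*"', raw)
    # Step 2: remove the unwanted prefix and the trailing quote from each match.
    return [m[len(PREFIX):-1] for m in matches]


print(extract_paths(html))
```

Links that do not start with the prefix are never matched in step 1, so they drop out without a separate filtering step.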
