Issues Extracting URL From Email Body

Hi all,

I am trying to extract a URL from an email body and open it in a browser session. However, the link is protected by Proofpoint, so the link address in the email is not the actual raw link.

I have managed to get the link itself, but it comes with a prefix added. I need to remove the prefix from that hyperlink so I can open the actual part of the link.

Please refer to my screenshots for a better understanding:

Here’s the link. What is outlined in RED is what I need removed.
image

This is because if I have UiPath open that link as is, this is what happens:
Imgur

How can I remove the added prefix so it opens the actual link right away?

Here’s what I’m doing:

thanks again :slight_smile:

@Jeff_Speer,

Can you paste the email body that contains the URL?
And you just want https://urldefense.com/v3/__ from the body, am I right?

The email body is just plain text. It depicts a ServiceNow incident.

The link is what I am interested in:
image

And no, the other way around. I need the rest of the link.

Here is the link I get when I click the link in the email to reach the page I need:


This is the link that works. This is what I get when I click on the link manually.

However, when UiPath opens it, I get this:


See how I can’t reach the page because there’s extra text in front of it?
When UiPath opens the link, that extra text ends up in front of the actual URL.

Thanks,

Hi @Jeff_Speer

After extracting the link from the email, you can use Replace to remove the extra text.

Let’s say the variable link stores the complete URL.

Then use link.Replace("https://urldefense.com/v3/__","")

to replace the extra text with an empty string.
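The same prefix removal, sketched in Python rather than as a UiPath expression. The sample link and the `strip_prefix` helper name are hypothetical, used only for illustration. Note that this removes the leading part only; anything appended after the original URL is left in place.

```python
PREFIX = "https://urldefense.com/v3/__"

def strip_prefix(link: str) -> str:
    """Remove the urldefense prefix if present; otherwise return the link unchanged."""
    if link.startswith(PREFIX):
        return link[len(PREFIX):]
    return link

# Hypothetical rewritten link, for illustration only.
sample = "https://urldefense.com/v3/__https://example.com/page__;!!abc$"
print(strip_prefix(sample))
```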

Try this

Let me know if it works for you

Regards

Nived N :robot:

Happy Automation :relaxed::relaxed::relaxed:

Hi thank you for your reply :slight_smile:

It is not taking the “__” as underscores but is treating them as spaces.

In link.Replace("https://urldefense.com/v3/__","")

it does not remove the underscores from the link, so Chrome ends up treating the URL string as a Google search term.

I hope you can help me figure out how to make the underscores also go away from the end of urldefense portion :slight_smile:

Thank you!

Never mind! I was able to remove the underscores.

However this is how my URL now opens:

I don’t know where the %3Chttps/ comes from?

So I did some further testing. I manually removed the %3Chttps/ from the beginning, but it still does not take me to the link. It seems that when the link is grabbed with UiPath, something in the rest of the URL gets modified so that it no longer takes me to the exact page, whereas if I click the link manually it takes me straight where I want to go.
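As a side note, %3C is the percent-encoded form of the "<" character, so one guess is that the extracted text still had an angle bracket in front of the link when it was passed to the browser. A minimal check in Python (the `strip_angle_brackets` helper and the sample text are hypothetical):

```python
from urllib.parse import quote, unquote

def strip_angle_brackets(text: str) -> str:
    """Drop surrounding whitespace and angle brackets from extracted link text."""
    return text.strip().strip("<>")

print(unquote("%3C"))   # decodes to the "<" character
print(quote("<"))       # encodes back to %3C
print(strip_angle_brackets(" <https://example.com/page> "))
```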

Here is the remainder of the link:
123.service-now.com/nav_to.do?uri=sc_task.do*3Fsys_id=09635e75874c34907eccebdd3fbb35ee*26sysparm_stack=sc_task_list.do*3Fsysparm_query=active=true__;JSUl!!Db5-Glse!KkYrvq0m74VRXX4rR130a8drSUncZUCFBMe_sC46VWPKEn7uXTeLGycmSXqpthtugOIi$>

But it does not work. :frowning:

Here is the real link! The one that works when I click on the actual email link:
123.service-now.com/nav_to.do?uri=%2Fsc_task.do%3Fsys_id%3D09635e75874c34907eccebdd3fbb35ee%26sysparm_stack%3Dsc_task_list.do%3Fsysparm_query%3Dactive%3Dtrue

So there is 100% something else going on. The link from the email is not the same when it is opened through UiPath. Something changes it.

EDIT: Sounds like a URL encoding issue, maybe?
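
If it is an encoding issue, one possible explanation is the way v3 urldefense links appear to be built: the original URL sits between the `__` after `/v3/` and a closing `__;`, certain characters in it are swapped for `*`, and the swapped-out characters are stored as base64 just before the `!!`. The Python sketch below works on that assumption and is not Proofpoint's own decoder. `decode_v3` is a hypothetical name, the sample is the link from this post with an assumed `https://` scheme and a `*` assumed in front of each of the three hex pairs, and multi-character `**` runs are not handled.

```python
import base64
import re

# Assumed v3 layout: https://urldefense.com/v3/__<url>__;<base64>!!<signature>$
V3_PATTERN = re.compile(r"https://urldefense\.com/v3/__(?P<url>.+?)__;(?P<b64>.*?)!")

def decode_v3(link: str) -> str:
    """Best-effort decode of a v3 link; returns the input unchanged if it does not match."""
    match = V3_PATTERN.search(link)
    if match is None:
        return link
    url = match.group("url")
    b64 = match.group("b64")
    # The base64 part may lack padding, so pad to a multiple of 4 first.
    replacements = base64.urlsafe_b64decode(b64 + "=" * (-len(b64) % 4)).decode("utf-8")
    chars = iter(replacements)
    # Swap each "*" for the next stored character; keep "*" if we run out.
    return re.sub(r"\*", lambda _m: next(chars, "*"), url)

# Link from the post, with scheme and "*" placeholders assumed (see above).
sample = (
    "<https://urldefense.com/v3/__https://123.service-now.com/nav_to.do"
    "?uri=sc_task.do*3Fsys_id=09635e75874c34907eccebdd3fbb35ee"
    "*26sysparm_stack=sc_task_list.do*3Fsysparm_query=active=true"
    "__;JSUl!!Db5-Glse!KkYrvq0m74VRXX4rR130a8drSUncZUCFBMe_sC46VWPKEn7uXTeLGycmSXqpthtugOIi$>"
)
print(decode_v3(sample))
```

On this sample, `JSUl` decodes to `%%%`, so each `*` becomes `%`. Once both are percent-decoded, the result differs from the working link only by the leading `/` in the `uri` value.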