I have to remove the unnecessary data from a variable's value.

HrefValue={“message”: “<a href="https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9587412\” target="_blank" rel="noopener noreferrer" style="color:#fff; text-decoration:none; font-family:Arial,Helvetica,sans-serif; font-size:13px; font-weight:bold" UiPath_custom_id="11">Download",
“level”: “Information”,
“logType”: “User”,
“timeStamp”: “07:25:25”,
“fileName”: “ExportPendingItemsInSupportCentralForPOCreation”,
“processVersion”: “1.2022.8518.25269”,
“jobId”: “840f748c-c45f-40d2-8ea2-0f26ef11e708”,
“robotName”: “hcrpa-dv006-34SV”,
“machineId”: 43,
“organizationUnitId”: 3
}

From the above text I need: https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9587412\

Thanks in advance
Likitha

@vinjam_likitha
use regex:

As it is JSON, we can deserialize it with the Deserialize JSON activity:

input: the JSON string
output: a JObject, myJObject

Retrieval:

And/Or Regex

The regex you shared is not working.

Can you please recheck?

Thanks
Likitha

Hi @vinjam_likitha

Use a regular expression. Try the one below to get the required output.

System.Text.RegularExpressions.Regex.Matches(yourstringinput.ToString,“((?<=\w*=").(?=\”\w\d*))”)

Hope it helps!!

I still cannot get the value with the above regular expression.

Thanks
Likitha

Why not Deserialize the XML and then just get the href property? HTML is XML, no?

XHTML is XML, but HTML5, for example, is not: an unclosed <br> tag causes an XML parsing error.
We assume the real JSON string is different from the initially shown sample, as it has to carry the HTML as a serialized string with the inner " characters escaped.

This could also be considered an option, but it needs the JSON deserialization and the retrieval of the message value first. Then:

Regex can be a shortcut. But in general it is preferable to:

  • Process things with the technologies that were made for them
    • JSON - Newtonsoft / System.Text.Json
    • XML - XML API / LINQ to XML
    • (X)HTML - XML / HTML Agility Pack
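
To illustrate the XML option from the list above, here is a minimal sketch using LINQ to XML. It assumes the anchor is well-formed: the closing </a> tag was added by hand because the fragment in this thread is cut off, and the trailing \ and most attributes were dropped for brevity. The module and variable names are only for illustration.

```vbnet
Imports System
Imports System.Xml.Linq

Module ExtractHrefWithXml
    Sub Main()
        ' Assumption: a well-formed anchor element. The closing </a> was added by hand
        ' and the attribute list was shortened for this sketch.
        Dim anchorMarkup As String =
            "<a href=""https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9587412"" target=""_blank"">Download</a>"

        ' XElement.Parse throws an XmlException if the markup is not well-formed XML,
        ' which is what happens with the cut-off fragment from the original post.
        Dim anchor As XElement = XElement.Parse(anchorMarkup)

        ' Attribute() returns Nothing when the attribute is missing.
        Dim hrefAttribute As XAttribute = anchor.Attribute("href")
        If hrefAttribute IsNot Nothing Then
            Console.WriteLine(hrefAttribute.Value)
        End If
    End Sub
End Module
```

For real-world HTML that is not well-formed XML, a dedicated HTML parser is likely the safer choice.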

Thanks for the detailed explanation. I’m trying to do all this to learn, using his original string, and I can’t even get it to work in the Deserialize JSON activity. I keep getting errors like this:

Deserialize JSON: After parsing a value an unexpected character was encountered: t. Path ‘message’, line 1, position 103.

That’s right after the URL. What did you do to his original string to make it work in Deserialize JSON?

Yes, as mentioned, we expect the value of message to be a JSON-serialized string.
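
As a rough sketch of what "JSON-serialized" means here: the curly quotes become straight quotes, and every " inside the message value is escaped as \". The string below is a shortened, hand-repaired version of the log entry (an assumption, not the real payload), and the sketch uses System.Text.Json instead of the Newtonsoft-based Deserialize JSON activity so that it runs without extra packages.

```vbnet
Imports System
Imports System.Text.Json
Imports System.Text.RegularExpressions

Module ParseMessageFromJson
    Sub Main()
        ' Assumption: hand-repaired sample. Straight quotes replace the curly ones,
        ' inner quotes are escaped as \" and the anchor is closed with </a>.
        ' In a VB string literal each " is written as "".
        Dim json As String =
            "{""message"": ""<a href=\""https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9587412\"" target=\""_blank\"">Download</a>"", ""level"": ""Information""}"

        Using doc As JsonDocument = JsonDocument.Parse(json)
            ' GetString() returns the unescaped value, i.e. the plain HTML.
            Dim message As String = doc.RootElement.GetProperty("message").GetString()

            ' Take everything between href=" and the next double quote.
            Dim href As String = Regex.Match(message, "(?<=href="")[^""]+").Value
            Console.WriteLine(href)
        End Using
    End Sub
End Module
```

With the Deserialize JSON activity, the equivalent retrieval would presumably be myJObject("message").ToString on the resulting JObject.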

@vinjam_likitha - Try this RegEx:

System.Text.RegularExpressions.Regex.Match(strInput, "https(.*)\\").ToString

ExtractHREF.xaml (6,7 KB)

Hope this helps!
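
A hedged alternative sketch: the pattern above is greedy and relies on a \ being present, so it runs to the last backslash on the line. Anchoring on href=" and stopping at the next double quote avoids both assumptions. The sample input is modeled on the string in this thread with shortened attributes; strInput is the same hypothetical variable name as above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractHrefWithRegex
    Sub Main()
        ' Sample input modeled on the string in this thread (attributes shortened).
        ' Note the literal backslash before the closing quote of href.
        Dim strInput As String =
            "<a href=""https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9587412\"" target=""_blank"">Download"

        ' (?<=href=") is a lookbehind: the match starts right after href=".
        ' [^"]+ then consumes everything up to, but not including, the next quote,
        ' so the trailing backslash is kept in the result.
        Dim href As String = Regex.Match(strInput, "(?<=href="")[^""]+").Value

        Console.WriteLine(href)
    End Sub
End Module
```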

What I’m asking is how you edited their original string to make it work. I’ve tried cleaning it up to make it valid JSON but can’t get it to work.

ah, ok

A sample like this:

will look like this in the Immediate panel (kindly note that the panel applies its own escaping/visualisation, so it is not a 1:1 view of the string content):

For checks we can also do:

Kindly note, regarding the above sample and the discussion on XML/HTML processing, that the HTML string is also cut off and incomplete.

I hope I got your question now and answered it accordingly

Hey @vinjam_likitha

I know you got enough help already, but if the last \ character is not that important to you, then our new activity that extracts URLs would work here just fine :slight_smile:

It’s the Extract Text activity with the URLs option. It is available starting with System package 23.6.0-preview.


No, I am still unable to find the required text.

Input (String variable):
<a href="https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9605803\" target="_blank" rel="noopener noreferrer" style="color:#fff; text-decoration:none; font-family:Arial,Helvetica,sans-serif; font-size:13px; font-weight:bold">Download",

Output: https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9605803\

Can you please help me get the regex pattern?

Thanks
Likitha

Hi @vinjam_likitha
Can you try this

I hope it helps!!

Hi
I am still not getting it.

Can you share your .xaml file? It will be a lot easier to help.

@vinjam_likitha
Can you show a screenshot of it?

Hi

This Expression worked for me

https(.*?)\

The output of Matches is an IEnumerable variable.

How do I extract the value from it?

Thanks
Likitha
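
One possible way to read the value, sketched under the assumption that the output is an IEnumerable(Of Match), as the Matches activity typically returns: take the first element and read its Value property, or loop over all elements. The variable names are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module ReadMatchValues
    Sub Main()
        ' Input string from the post above.
        Dim strInput As String =
            "<a href=""https://app.sc.ge.com/api/filemanager/v2/apps/1656/tenants/95/sysfiles/9605803\"" target=""_blank"">Download"

        ' Cast to IEnumerable(Of Match) to mirror the output type described above.
        Dim matches As IEnumerable(Of Match) =
            Regex.Matches(strInput, "https(.*?)\\").Cast(Of Match)()

        ' First match only; FirstOrDefault returns Nothing when there is no match.
        Dim firstMatch As Match = matches.FirstOrDefault()
        If firstMatch IsNot Nothing Then
            Console.WriteLine(firstMatch.Value)
        End If

        ' Or iterate when several URLs may be present.
        For Each m As Match In matches
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

In an Assign activity this would boil down to something like matches.First().Value, or matches(0).Value, provided at least one match exists.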