Regex to remove ID tags

Input: "<meta id="conf-con-path" name="conf-con-path" content="/wiki2">"

Mission: I want all id attributes to be removed from the input string (there can be more than one). Therefore the result needs to be: "<meta name="conf-con-path" content="/wiki2">"

Can you guys help?

Hi,

Can you try the following expression?

System.Text.RegularExpressions.Regex.Replace(text,"id=""[^""]*""",String.Empty)
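A minimal sketch of the same replace idea, written in Python instead of VB.NET so it can be run on its own. The function name strip_id_attributes is hypothetical, and the sketch assumes attribute values are always double-quoted. It differs from the expression above in one respect: it also consumes the whitespace in front of the attribute, so no double space is left behind and attributes such as data-id are not touched.

```python
import re


def strip_id_attributes(html: str) -> str:
    """Remove every id="..." attribute (hypothetical helper, not a UiPath API)."""
    # \s+ eats the whitespace before the attribute, which also means the
    # pattern only fires when "id" starts a new attribute (not "data-id").
    return re.sub(r'\s+id="[^"]*"', "", html)


sample = '<meta id="conf-con-path" name="conf-con-path" content="/wiki2">'
cleaned = strip_id_attributes(sample)
print(cleaned)
```

In .NET the equivalent pattern string would need the quotes doubled, as in the expression above.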

Regards,

@LauraMM
As an alternative, you can use a non-greedy pattern with Regex.Replace:
[screenshot]

Don't forget to double each " inside the pattern string.

Cool. But is it possible to do this directly in regex without having to use a replace?

@LauraMM
Check whether handling it with groups meets your needs:
(.*)(id="(.)*?")(.*)

[screenshots: regex tester with the matched groups highlighted]
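To make the group idea concrete, here is a small sketch in Python (standing in for the .NET Matches output) that applies the pattern above to the sample input and joins the groups around the id attribute. The variable names are only illustrative.

```python
import re

text = '<meta id="conf-con-path" name="conf-con-path" content="/wiki2">'

# Group 1: everything before the id attribute
# Group 2: the id attribute itself
# Group 4: everything after it
match = re.match(r'(.*)(id="(.)*?")(.*)', text)

before = match.group(1)
id_attr = match.group(2)
after = match.group(4)

# Joining groups 1 and 4 drops the id attribute. The spaces on both
# sides of the attribute survive, so the result contains a double space.
joined = before + after
print(repr(joined))
```

Because the first group is greedy, this pattern isolates only the last id attribute when the input contains several.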

Cool. But the regex pattern still gives me the full match including the id. I want the full match to be everything except the id attribute.

@LauraMM
As mentioned, you would work with groups (see the highlights on the screenshot above). Also have a look here:

Ok. But that doesn’t help me further, since I then have to use .groups(x), which I’m trying to avoid. Is it possible to create a regex pattern for my mission? The thread doesn’t help on that either.

ok, so your requirements

  • not using regex.replace
  • not using groups

Any other requirements, or other things you do not want to use?

Exactly.

What I want is to use regex101 and then get a full match result of "<meta name="conf-con-path" content="/wiki2">"

Is that not possible?

@Steven_McKeering to the rescue! Is it possible to do this in Regex without any other coding in UiPath, meaning can we exclude the id attribute with the regex pattern alone?

Hello @LauraMM

@yoichi and @ppr are REAL gurus (the real rescue crew) and are some of the most knowledgeable people on forums. I listen to them. Their advice trumps mine :blush:

What you are asking is not possible in 1 step.

Simplest option: You will need to use a replace activity. It’s really not a bad option - it’s easy. Keeping it simple is the goal always!

Use this pattern ((?=id).*(?=name)) in a Replace activity. The replacement is “” (an empty string). It will remove the matched element from the text.
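A quick sketch of what that replace does, in Python rather than in a UiPath Replace activity. It relies on the same assumptions as the pattern itself: the id attribute sits directly before the name attribute, and the letters "id" and "name" do not occur anywhere else in the string.

```python
import re

text = '<meta id="conf-con-path" name="conf-con-path" content="/wiki2">'

# (?=id)   start matching where "id" begins
# .*       take everything that follows...
# (?=name) ...but stop right before "name"
result = re.sub(r'((?=id).*(?=name))', '', text)
print(result)
```

The match includes the space after the id attribute, which is why the result has no double space.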

Otherwise you could do it in two steps and join them. (More work).

Otherwise you’ll need to use groups.
See Section 4 of my mega post for information on groups if you want to use groups.

Why don’t you want to use groups or replace methods?

It’s always important to provide as much information about your text as possible, since that leads to the best Regex pattern.

Thank you the three of you @ppr @Yoichi @Steven_McKeering

Just to understand you right: is it not possible to do a match in Regex for everything except the id tag? Or do you not know how to do it and therefore suggest that I do a replace/remove?

I don’t want to use replace/groups if it can be done in Regex. I want to understand how I can do a match on “everything except” the id tag.

Hello

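What you do not want to capture is in the middle of your string/sample. Regex is a tool to find a pattern of text, and a single match is always one contiguous stretch of the input, so it cannot skip over something in the middle. Unfortunately it is not possible with just one pattern with one result.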

Not unless @yoichi or @ppr know of something.

Now you have only provided 1 sample, the output and limited info, so we have to make assumptions.

Assuming all your samples are the same then you could try this workaround but it’s not the traditional/proper/best use of Regex.

Try this pattern
<meta\s|name.*
In a matches activity.

Then use an Assign activity (or Write Line) like this:
StringResult = MATCHESOUTPUT(0).ToString + MATCHESOUTPUT(1).ToString

*Replace MATCHESOUTPUT with the name of your Matches result variable.
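The same two-step workaround can be sketched in Python, with re.findall playing the role of the Matches activity. Like the pattern itself, it assumes every sample starts with <meta followed by the id attribute, and that the name attribute comes next.

```python
import re

text = '<meta id="conf-con-path" name="conf-con-path" content="/wiki2">'

# Step 1: collect the two pieces on either side of the id attribute.
pieces = re.findall(r'<meta\s|name.*', text)

# Step 2: join the first and second match, as in the Assign above.
result = pieces[0] + pieces[1]
print(result)
```

If a sample has a different attribute order, the pieces will not line up and the join gives the wrong string.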

This is an unusual workaround and is likely not endorsed by the likes of @ppr or @Yoichi

Either way, you are going to need a second step of sorts.

I am still unsure of why you are not wanting to use the replace method :confused:

Two simple, robust steps are better than one complex one.