RegEx: the text is different each time and my regular expression is not working

My text includes:

Name, Person ID, Company name, User ID
Amit trivedi, 12345678, Capgemini A/S, (54623)

To get those values out I'm using this regex:

(.*),(.*),(.*),(.*)

and it gives me 4 values, because the commas separate the four fields. The result is:
Group 1: Amit trivedi
Group 2: 12345678
Group 3: Capgemini A/S
Group 4: (54623)
(And it works perfectly in 90% of cases.)

But I got some data where the company name is long and includes an extra comma “,” like

Name, Person ID, Company name, User ID
Rajesh Rathoad, 12389678, Perpetual Income, Growth Investment Trust PLC, (54623)

What my regex did: it put the first two values into one group and split the rest as normal.
It gave me:
Group 1: Rajesh Rathoad, 12389678
Group 2: Perpetual Income
Group 3: Growth Investment Trust PLC
Group 4: (54623)
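
The behaviour described above follows from `.*` being greedy: the first group grabs as much as it can and only gives back enough text for the remaining three commas to match, so any extra comma ends up inside group 1. A minimal sketch that reproduces both results is below. It uses Python's `re` as a stand-in for the .NET engine (an assumption of this sketch; the pattern behaves the same way in both), and `split_fields` is a hypothetical helper name.

```python
import re

# The pattern from the question: four greedy groups separated by three commas.
PATTERN = re.compile(r"(.*),(.*),(.*),(.*)")


def split_fields(line):
    """Return the four captured groups with surrounding whitespace removed."""
    match = PATTERN.match(line)
    if match is None:
        return None
    return [group.strip() for group in match.groups()]


short_line = "Amit trivedi, 12345678, Capgemini A/S, (54623)"
long_line = ("Rajesh Rathoad, 12389678, Perpetual Income, "
             "Growth Investment Trust PLC, (54623)")

# Three commas: every group gets exactly one field.
print(split_fields(short_line))
# Four commas: the greedy first group keeps "Rajesh Rathoad, 12389678".
print(split_fields(long_line))
```

Because the extra comma is absorbed at the front, the fix is to pin down the fields whose shape is known (the numeric Person ID and the parenthesised User ID) and let only the company group be free-form.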

My data always has the same layout:
Name, Person ID, Company Name, User ID

How can I make sure the regex keeps working when the company name changes from long to short or from short to long, with one or more commas?

Can someone help?

Hi @Ellen

Try this

[A-Za-z ]+(?=\,\s+\d+)

(?<=([A-Za-z ]+\,\s+))\d+

(?<=([A-Za-z ]+\,\s+)\d+\,\s+).*(?=\,\s+\()

(?<=\()\d+

Regards,

If the data fields are always in a fixed order and the number of fields is consistent, you can use a more sophisticated regex pattern to capture the entire company name even if it contains commas. Here’s a pattern that might help:

^(.*?),\s*(\d+),\s*([^,]+(?:,[^,]+)*)\s*,\s*(\(.*\))$

Explanation of the Regex Pattern

  1. ^(.*?),: Captures the Name field up to the first comma.
  2. \s*(\d+),: Captures the Person ID field.
  3. \s*([^,]+(?:,[^,]+)*)\s*,: Captures the Company Name field, allowing it to include commas. This part handles the possibility of multiple commas within the Company Name.
  4. \s*(\(.*\))$: Captures the User ID field, which is enclosed in parentheses.
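
As a quick check of the pattern explained above, here is a small Python sketch that runs it against both sample lines from the question. Python's `re` is used as a stand-in for the .NET engine (an assumption), and `parse_record` is a hypothetical helper name.

```python
import re

# The pattern from the post above, unchanged.
RECORD = re.compile(
    r"^(.*?),\s*(\d+),\s*([^,]+(?:,[^,]+)*)\s*,\s*(\(.*\))$"
)


def parse_record(line):
    """Return (name, person_id, company, user_id) or None if the line does not fit."""
    match = RECORD.match(line.strip())
    return match.groups() if match else None


short_line = "Amit trivedi, 12345678, Capgemini A/S, (54623)"
long_line = ("Rajesh Rathoad, 12389678, Perpetual Income, "
             "Growth Investment Trust PLC, (54623)")

print(parse_record(short_line))
print(parse_record(long_line))
```

The lazy first group stops at the first comma, `(\d+)` forces the second field to be numeric, and the company group only gives back its last comma-separated chunk so that the parenthesised User ID can match.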

Regards
Sandy

Maybe the following pattern will serve better:


(.*?),\s(\d+),\s(.*),\s(\(\d+\))
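
A short sketch of how this pattern could be applied, again with Python's `re` standing in for the .NET engine (an assumption). The field labels in `FIELDS` and the helper `to_record` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical labels for the four groups, in the order the pattern captures them.
FIELDS = ("name", "person_id", "company", "user_id")

# The pattern from the post above, unchanged.
PATTERN = re.compile(r"(.*?),\s(\d+),\s(.*),\s(\(\d+\))")


def to_record(line):
    """Map the four captured groups to labelled fields, or return None on no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return dict(zip(FIELDS, match.groups()))


long_line = ("Rajesh Rathoad, 12389678, Perpetual Income, "
             "Growth Investment Trust PLC, (54623)")
record = to_record(long_line)
print(record["company"])  # the company keeps its internal comma
```

Only the company group is greedy here, and it is bracketed by a numeric field on the left and `(\d+)` in parentheses on the right, so extra commas have nowhere to go except into the company name. The header line itself does not match, because its second field is not numeric.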

Hi @Ellen

If you want to extract each item with its own regular expression, use the following:

[A-Za-z]+[A-Za-z\s]+(?=,\s+\d+)

(?<=[A-Za-z\s]+,\s+)\d+

(?<=\d+,\s+).*(?=,\s+\(\d+\))

(?<=\()\d+(?=\))

Hope it helps!!

Hi @Ellen

Try this. Hope it helps.

^(.*?), (\d*), (.*?), (\(\d+\))


image

Regards,
Samsanditha

@Ellen, if you want to extract everything with a single regular expression, try the one below:

[A-Za-z]+.*(?=\,\s+\d+)|(?<=\,\s+)\(?\d+\)?|(?<=\d+\,\s+).*(?=\,\s+\(\d+)

Hope it helps!!

I tried all of those, but none of them gives the right answer.
I also used regex101.com to evaluate the regexes mentioned here, and nothing matches my text. :frowning:

image

image
Everything is set up, but it gives this issue: the regular expression result is 0.

image

I got the same result when I tried your pattern as well.

When I tried those regexes one by one at regex101.com against my sample text, it didn't show me any results in groups.

I think I have a working regex.
I modified my old regex a little and it seems to give the right result.
It's:
image

I will test more with different kinds of possible text and see whether it still works.
I will write back if I need help with the same issue.

Thanks for the replies and all the help.

Hi @Ellen, you can achieve your result using string manipulation too.
Here is a screenshot:


Mark this as the solution if it helps you.
Cheers, happy automation!
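
The screenshot for the string-manipulation suggestion above is not preserved, so the following is only one possible approach and not necessarily the poster's. The idea is that the first two fields and the last field never contain commas, so they can be split off from the left and from the right, and whatever remains in the middle is the company name. `split_record` is a hypothetical helper name.

```python
def split_record(line):
    """Split 'Name, Person ID, Company, (User ID)' without a regex.

    Raises ValueError if the line has fewer than three commas.
    """
    # Take the first two fields from the left; the rest stays in one piece.
    name, person_id, rest = line.split(",", 2)
    # Take the last field from the right; the middle is the company name.
    company, user_id = rest.rsplit(",", 1)
    return [part.strip() for part in (name, person_id, company, user_id)]


long_line = ("Rajesh Rathoad, 12389678, Perpetual Income, "
             "Growth Investment Trust PLC, (54623)")
print(split_record(long_line))
```

This works for any number of commas inside the company name, as long as the name, Person ID, and User ID themselves contain none.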

Hi @Ellen

Can you try the regex and the LINQ query below, and let me know if any error comes up?

Regex:

([a-zA-Z ]+),([\d ]+),(.*),(.*)

LINQ:

Regex.Match(currentLine, "([a-zA-Z ]+),([\d ]+),(.*),(.*)").Groups.Cast(Of Group)().Skip(1).Select(Function(item) item.ToString())

Note: replace “currentLine” above with your input line of text.

I have attached the XAML file.
RegexValueDifferentiator.xaml (8.9 KB)

Thanks
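
For readers without UiPath at hand, the regex from the post above can be tried with a rough Python equivalent of the LINQ expression: `match.groups()` plays the role of `Groups...Skip(1)`, because it already leaves out the whole-match group. Python's `re` stands in for the .NET engine, `extract_values` is a hypothetical name, and the `strip()` call is an addition of this sketch (the LINQ version returns the values with their leading spaces).

```python
import re

# The regex from the post above, unchanged.
PATTERN = re.compile(r"([a-zA-Z ]+),([\d ]+),(.*),(.*)")


def extract_values(current_line):
    """Return the four captured values, or an empty list when nothing matches."""
    match = PATTERN.search(current_line)
    if match is None:
        return []
    return [value.strip() for value in match.groups()]


long_line = ("Rajesh Rathoad, 12389678, Perpetual Income, "
             "Growth Investment Trust PLC, (54623)")
print(extract_values(long_line))

# Caveat: the name class allows only letters and spaces, and the search is
# unanchored, so a name with other characters is silently cut short.
print(extract_values("Anne-Marie Smith, 1, X Ltd, (2)"))
```

Restricting the first two groups to letters and digits is what stops them from swallowing an extra comma. The trade-off is the caveat shown in the second call, so names with hyphens, apostrophes, or accented letters would need a wider character class.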
