RegEx for extracting value from string with trim

Hello! I have a string with a lot of information that I need to extract a value from.

The string contains a lot of information from a website, and the values in it look like this:

“Number 1. Amount: 43 dollars”
“Number 2. Amount: 22 dollars”
“Number 3. Amount: 23 dollars”
etc.

I only want the value for Number 3. In this example, all I want to extract is 23 dollars. The amount can vary, and so can the order of the list (Number 1, 2, 3, etc.). So what I want to do is use a regex that starts after "Number 3. Amount: " and trims the rest.

My current code takes everything from the rest of the page, starting after Number 3 and ending at the last occurrence of “dollars”. Can someone please help me alter it so that it trims the rest instead?

System.Text.RegularExpressions.Regex.Match(EntirePage,"((?<=Number 3. Amount:).*(?= dollars))").Value

Hi, welcome to the community!
This seems too simple to need regex. If the part you need to remove is always the same, I recommend just removing it from the string, like:
result = "Number 3. Amount: 23 dollars".Remove(0, 17)

Hi! I posted a simplified version of the string. The only constant is that I want to extract the value after “Number 3. Amount:” (its length and the values around it can vary).

What does 17 indicate?

In the Remove function, 0 is the position to start deleting from and 17 is the number of characters to remove.

I'm pretty new to regex, but I looked online for some examples and came up with this:
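
A minimal sketch of the fixed-offset idea, written in Python purely for illustration: slicing with `text[count:]` stands in for .NET's `Remove(0, count)`, and the helper name `remove_prefix_chars` is hypothetical. It suggests why the approach only works when the prefix is always exactly 17 characters long.

```python
def remove_prefix_chars(text: str, count: int) -> str:
    """Drop the first `count` characters, similar to .NET's String.Remove(0, count)."""
    return text[count:]


line = "Number 3. Amount: 23 dollars"
# "Number 3. Amount:" is 17 characters long, so the space before the value survives.
print(repr(remove_prefix_chars(line, 17)))  # ' 23 dollars'

# With a two-digit list number, the same offset cuts in the wrong place.
print(repr(remove_prefix_chars("Number 12. Amount: 23 dollars", 17)))  # ': 23 dollars'
```

Note that the unit ("dollars") is still attached to the result, so a fixed offset alone does not isolate the amount.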

(?<=Number 3. Amount: )(.*)(?= dollars)

What if I don’t know how many characters to remove?
What I mean is, the values can vary from time to time. Sometimes it could be:

Number 2: Amount 24 dollars
Number 5: Amount 26 dollars
Number 8: Amount 22 dollars
Number 3: Amount 55 dollars
(could go on)

and another could be:

Number 1: Amount 22 dollars
Number 3: Amount 14 dollars
Number 8: Amount 43 dollars

This is why I want to trim everything after the value: if the end marker is not unique, the match takes everything up to the last occurrence of “dollars”.

If this is the string:

Number 1: Amount 22 dollars
Number 3: Amount 14 dollars
Number 8: Amount 43 dollars

The result would be this:

14 dollars Number 8: Amount 43 dollars

@agnesv Is it a . or a : after Number 3?

None of the examples you gave needed a number other than 17…

Sorry, I wasn’t being specific. It’s actually neither. This is the exact format:

Number 3 Amount:0,23 dollars

(Just an example; the amount can vary.)

@agnesv Your original regex was almost correct; it just needed a small modification. Try this: (?<=Number 3 Amount:)\s*(.*)(?=dollars)


Is that better then? It stops as soon as it reaches a letter.

I’ve tried both alternatives now, but each time the match starts at the right value and then takes everything that comes after it. Does anything look wrong with either of them?

System.Text.RegularExpressions.Regex.Match(EntirePage,"(?<=Number 3 Amount:)\s(.*)(?= dollars)").Value

System.Text.RegularExpressions.Regex.Match(EntirePage,"(?<=Number 3 Amount:)(.*)(?= [^a-z] *)").Value

@agnesv Why do you use a . after Number 3?

Haha, sorry. The string actually contains something else, but I cannot write it out here, so I’ve had to modify it. I will edit it to the correct form.

@agnesv Can you share the text file?

I cannot share it, but the only difference between my example and the real one is that my string does not contain line breaks. It’s structured this way:

Number 6 Amount:44 dollars Number 3 Amount:0,45 dollars Number 8 Amount:9,5 dollars Number 4 Amount:88 dollars

Could that be why?
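
A small sketch of why the missing line breaks matter, written in Python for illustration (in both Python and .NET, `.` does not match a newline by default). The helper `first_match` and the sample strings are assumptions based on the format shown above. With one entry per line the greedy `.*` is stopped by the line end; on a single line it runs on to the last " dollars".

```python
import re

# Greedy pattern in the style of the earlier attempts.
GREEDY = r"(?<=Number 3 Amount:)(.*)(?= dollars)"
# Lazy variant: `.*?` stops at the first " dollars" that follows.
LAZY = r"(?<=Number 3 Amount:)(.*?)(?= dollars)"

one_line = ("Number 6 Amount:44 dollars Number 3 Amount:0,45 dollars "
            "Number 8 Amount:9,5 dollars Number 4 Amount:88 dollars")
# Same data, but with each entry on its own line.
multi_line = one_line.replace(" Number", "\nNumber")


def first_match(pattern: str, text: str) -> str:
    """Return the first match of `pattern` in `text`, or "" if there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else ""


print(first_match(GREEDY, multi_line))  # 0,45
print(first_match(GREEDY, one_line))    # 0,45 dollars Number 8 Amount:9,5 dollars Number 4 Amount:88
print(first_match(LAZY, one_line))      # 0,45
```

Making the quantifier lazy is one way to stop at the first " dollars"; restricting the allowed characters is another.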

@agnesv Yes. You should give the correct information.


@agnesv Try this:
(?<=Number 3 Amount:)\s*[0-9.,]*(?= dollars)
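
For illustration, here is a sketch of the same pattern in Python, where the lookbehind, character class and lookahead behave the same way as in .NET. The function name `extract_amount` and its `number` parameter are hypothetical additions, not part of the original expression. Because `[0-9.,]*` cannot match letters or spaces, the match cannot run past the first amount.

```python
import re


def extract_amount(page: str, number: int = 3) -> str:
    """Return the amount listed after 'Number <number> Amount:', or "" if absent."""
    # Only digits, dots and commas are allowed, so the match ends before " dollars".
    pattern = rf"(?<=Number {number} Amount:)\s*[0-9.,]*(?= dollars)"
    m = re.search(pattern, page)
    # strip() drops any whitespace that \s* picked up in front of the value.
    return m.group(0).strip() if m else ""


page = ("Number 6 Amount:44 dollars Number 3 Amount:0,45 dollars "
        "Number 8 Amount:9,5 dollars Number 4 Amount:88 dollars")
print(extract_amount(page))     # 0,45
print(extract_amount(page, 8))  # 9,5
```

The result is returned as text; converting "0,45" to a number would depend on the locale and is left out here.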

That worked! Thanks a million!
