Extracting the date of the first occurrence

Hi,

I have a string like the one below.

Jane passed away on 3 May 2022, and in her loving memory … on 7 May 2022 something was held and this information was published on 25 May 2022.

This is all one string. I want to extract the first date that occurs after the keyword "pass". The keyword could appear as "passing", "passed", etc.

Please let me know if there is a solution.

Thanks.

For this text in particular, you can use a 'Text to Left/Right' activity and set the separator to ' passed away on '.

The 'text to the right' output (named rightText here) will be: "3 May 2022, and in her loving memory … on 7 May 2022 something was held and this information was published on 25 May 2022."

You can either assign rightText.Split(","c)(0).Trim (which returns '3 May 2022') to a variable, or use that expression directly wherever you need the value. The result is the same.

If you plan to automate more texts with the same structure, this will work just fine.
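For anyone who wants to try the separator-then-comma idea outside UiPath, here is a minimal sketch in Python (not the UiPath activity itself). The function name `date_after_separator` and the default separator are just illustrative assumptions.

```python
def date_after_separator(text, separator=" passed away on "):
    """Return the text between the separator and the next comma, or None."""
    # partition() splits at the first occurrence of the separator only.
    _, found, right = text.partition(separator)
    if not found:
        return None  # the separator does not appear in the text
    # Keep everything up to the first comma and drop surrounding spaces.
    return right.split(",")[0].strip()


sample = ("Jane passed away on 3 May 2022, and in her loving memory … "
          "on 7 May 2022 something was held and this information "
          "was published on 25 May 2022.")
print(date_after_separator(sample))  # 3 May 2022
```

Keep in mind that this only works while the wording is exactly "passed away on" and the date is followed by a comma; a text that says "passing" or has no comma after the date needs a pattern-based approach.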

Hi,

Can you try the following expression?

System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)\d{1,2} [A-Z]+ \d{4}",System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value

If other text in your string might also match the above pattern, we can use a stricter pattern, for example one that lists the month names.
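As a rough illustration of what the expression does, here is a Python sketch of the same idea. Python's `re` module does not support the variable-length lookbehind that .NET allows, so this version (an assumption on my side, not the expression above) matches the keyword first and captures the date in a group instead. The function name `first_date_after_keyword` is hypothetical.

```python
import re


def first_date_after_keyword(text, keyword="pass"):
    """Return the first 'd MMM yyyy'-style date after the keyword, or None."""
    # keyword, then as little text as possible, then the captured date.
    pattern = re.escape(keyword) + r".*?(\d{1,2} [A-Z]+ \d{4})"
    # IGNORECASE lets [A-Z]+ match "May" and lets "pass" match "Passed".
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1) if match else None


sample = ("Jane passed away on 3 May 2022, and in her loving memory … "
          "on 7 May 2022 something was held and this information "
          "was published on 25 May 2022.")
print(first_date_after_keyword(sample))  # 3 May 2022
```

Because the keyword is matched as a plain substring, words such as "passing" or "passed" all qualify, but so would unrelated words that contain "pass".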

Regards,

It shows a pattern error.

Hi,

How did you check the pattern? It works well in my environment, as shown below.

Regards,


The input string is this.

Hi @Reddy_Emani_Jeevan

Take a look at this pattern (you can preview and play with it here). I have modified @Yoichi's wonderful pattern :blush:

System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)\d{1,2} (January|February|March|April|May|June|July|August|September|October|November|December) 20\d{2}").ToString

Regex.Match returns only the first match, so this will get the first date after the keyword each time.

Cheers

Steve

Hi,

If your document may contain both date styles, dd MMM yyyy and MMM dd yyyy, the following will work.

 System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)(\d{1,2}[,\s]+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*|(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*[,\s]+\d{1,2})[,\s]+\d{4}").Value
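To see how the two date styles are handled, here is a small Python sketch along the same lines. As an assumption for the port, it uses a capture group rather than the .NET lookbehind (Python's `re` does not allow variable-length lookbehind), and the names `MONTH`, `DATE` and `first_date_either_style` are hypothetical.

```python
import re

# Abbreviated month followed by optional letters or a dot ("May", "Sept.", "June").
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*"
# Either "day month" or "month day", then a four-digit year.
DATE = (r"(?:\d{1,2}[,\s]+" + MONTH + r"|" + MONTH + r"[,\s]+\d{1,2})"
        r"[,\s]+\d{4}")
# Keyword, the shortest possible gap, then the captured date.
PATTERN = re.compile(r"pass.*?(" + DATE + r")")


def first_date_either_style(text):
    """Return the first date in either style after 'pass', or None."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(first_date_either_style("Jane passed away on 3 May 2022, and ..."))  # 3 May 2022
print(first_date_either_style("She passed on May 3, 2022 and ..."))        # May 3, 2022
```

Like the expression above, this sketch is case-sensitive, so the month must be capitalised and the keyword must appear in lower case.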

Regards,