Hi,
I have a string like the one below.
Jane passed away on 3 May 2022, and in her loving memory … on 7 May 2022 something was held and this information was published on 25 May 2022.
This is all one string. I want to extract the first date that appears after the keyword "pass". The keyword could appear as "passing", "passed", etc.
Please let me know if there is a solution.
Thanks.
jpbelchote
(João Pedro Oliveira Belchote)
May 28, 2022, 3:31pm
2
For this text in particular, you can use a 'Text to Left/Right' activity and set the separator to " passed away on ".
The text to the right of the separator (stored here in a variable named rightText) will be: "3 May 2022, and in her loving memory … on 7 May 2022 something was held and this information was published on 25 May 2022."
You can either assign rightText.Split(",")(0) (which returns "3 May 2022") to a variable, or use the expression directly wherever you need the value, for example to print it. The result is the same.
If you plan to automate more texts with the same structure (the exact phrase "passed away on" followed by a date and a comma), it will work just fine.
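For readers who want to try the split idea outside UiPath, here is a minimal Python sketch of the same two steps: cut the text at the separator, then keep everything up to the first comma. The variable names are illustrative, and the code assumes the exact phrase " passed away on " is present and the date is followed by a comma.

```python
# Sample text from the question above.
text = ("Jane passed away on 3 May 2022, and in her loving memory … on 7 May 2022 "
        "something was held and this information was published on 25 May 2022.")

separator = " passed away on "

# str.partition splits at the first occurrence of the separator and
# returns (before, separator, after); 'found' is empty if it is absent.
_before, found, after_separator = text.partition(separator)

# Keep everything up to the first comma; fall back to "" without a separator.
first_date = after_separator.split(",")[0].strip() if found else ""

print(first_date)  # 3 May 2022
```

This only works for the fixed wording; a sentence that says "passing" instead of "passed away on" would yield an empty result.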
Yoichi
(Yoichi)
May 28, 2022, 10:25pm
3
Hi,
Can you try the following expression?
System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)\d{1,2} [A-Z]+ \d{4}",System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value
If other text in your input could also match the above pattern, we can make the pattern stricter, for example by listing the month names explicitly.
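As a rough illustration of what the expression does, here is a Python sketch. Python's built-in re module does not support a variable-length lookbehind such as (?<=pass.*?), so this version swaps in a capture group instead; the function name is hypothetical and the behaviour is intended to match only for simple inputs like the one in the question.

```python
import re

# "pass" followed lazily by anything, then a date of the form "d[d] Word yyyy".
# The capture group stands in for .NET's variable-length lookbehind.
DATE_AFTER_PASS = re.compile(r"pass.*?(\d{1,2} [A-Z]+ \d{4})", re.IGNORECASE)


def first_date_after_pass(text: str) -> str:
    """Return the first date after the keyword, or "" if there is none."""
    match = DATE_AFTER_PASS.search(text)
    return match.group(1) if match else ""


sample = ("Jane passed away on 3 May 2022, and in her loving memory … on 7 May 2022 "
          "something was held and this information was published on 25 May 2022.")
print(first_date_after_pass(sample))  # 3 May 2022
```

Dates that appear before the keyword are skipped, because the match has to start at "pass".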
Regards,
Yoichi
(Yoichi)
May 29, 2022, 1:52pm
5
Hi,
How did you check the pattern? It works well in my environment, as follows.
Regards,
Yoichi:
System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)\d{1,2} [A-Z]+ \d{4}",System.Text.RegularExpressions.RegexOptions.IgnoreCase).Value
The input string is this.
Hi @Reddy_Emani_Jeevan
Take a look at this pattern. I have modified @Yoichi's wonderful pattern:
System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)\d{1,2} (January|February|March|April|May|June|July|August|September|October|November|December) 20\d{2}").ToString
Regex.Match returns only the first match, so this will get the first date after the keyword each time.
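To show why listing the month names can matter, here is a small Python comparison of the loose and strict patterns. It is a sketch only: the sentence is made up for illustration, and a capture group replaces the .NET lookbehind because Python's re module does not allow variable-length lookbehinds.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Loose: any word between the day and the year (case-insensitive).
loose = re.compile(r"pass.*?(\d{1,2} [A-Z]+ \d{4})", re.IGNORECASE)
# Strict: a full month name and a year starting with 20 (case-sensitive).
strict = re.compile(r"pass.*?(\d{1,2} (?:" + MONTHS + r") 20\d{2})")

# Made-up sentence in which a non-date phrase also fits the loose pattern.
text = "He passed exam 7 in 2019 and retired on 12 June 2021."

loose_hit = loose.search(text).group(1)
strict_hit = strict.search(text).group(1)

print(loose_hit)   # 7 in 2019
print(strict_hit)  # 12 June 2021
```

Note that the strict pattern is case-sensitive, so it would miss a capitalised "Passed" at the start of a sentence unless an ignore-case option is added.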
Cheers
Steve
Yoichi
(Yoichi)
May 29, 2022, 11:35pm
8
Hi,
If your document may contain both date styles, dd MMM yyyy and MMM dd yyyy, the following will work.
System.Text.RegularExpressions.Regex.Match(yourString,"(?<=pass.*?)(\d{1,2}[,\s]+(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*|(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*[,\s]+\d{1,2})[,\s]+\d{4}").Value
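The two-style pattern can be tried out with a Python sketch along the following lines. As before, a capture group stands in for the .NET lookbehind, which Python's re module cannot express; the helper name and the sample sentences are assumptions made for illustration.

```python
import re

# Abbreviated month name, optionally followed by a dot or more lowercase letters.
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[.a-z]*"

# Either "dd Month" or "Month dd", then the four-digit year.
DATE_AFTER_PASS = re.compile(
    r"pass.*?((?:\d{1,2}[,\s]+" + MONTH + r"|" + MONTH + r"[,\s]+\d{1,2})[,\s]+\d{4})"
)


def first_date_after_pass(text: str) -> str:
    """Return the first date in either style after the keyword, or ""."""
    match = DATE_AFTER_PASS.search(text)
    return match.group(1) if match else ""


print(first_date_after_pass("Jane passed away on 3 May 2022, and ..."))
print(first_date_after_pass("Jane passed away on May 3, 2022 and ..."))
```

The returned text keeps the original punctuation, so a further parsing step would be needed to turn it into an actual date value.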
Regards,