Hi Team
I am finding holiday dates using regular expressions. I have an Excel file as shown in Fig.
Input:-
“APR 202314 FriDr.Baba Saheb Ambedkar Jayanti”
“MAY 202301 MonMaharashtra Day”
“JUN 2023No Holidays”
“JUL 2023No Holidays”
“AUG 202315 TueIndependence Day30 WedRaksha Bandhan”
“SEP 202328 ThuGanesh Visarjan/Eid-E-Milad”
“OCT 202302 MonMahatma Gandhi Jayanti24 TueDashera”
“NOV 202315 WedBhaubij”
“DEC 2023No Holidays”
“JAN 2024No Holidays”
“FEB 2024No Holidays”
“MAR 2024No Holidays”
Expected output:-
APR 2023 14
MAY 2023 01
AUG 2023 15
AUG 2023 30
SEP 2023 28
OCT 2023 02
OCT 2023 24
NOV 2023 15
How can I find it?
Thanks
Minal P
1 Like
@minal.patil
Try this pattern:
\b[A-Z]{3}\s+\d{4}\s*\d{2}\b
1 Like
ppr
(Peter)
October 27, 2023, 1:19pm
3
A quick pattern, maybe too tolerant as AAA would also be accepted as Month Code
And can use the groups for the output splits
[CheatSheet] - System.Text.RegularExpressions | RegEx - News / Tutorials - UiPath Community Forum
1 Like
Hope the below steps would help you resolve this
Use a read range activity and get the output as dt
Then use a assign activity like this to get the no holdiays list first
dt = dt.AsEnumerable().Where(Function(a) NOT a.ToString.Contains(“No Holidays”)).Select(Function(b) Datetime.ParseExact(b.ToString.SubString(0,10).Trim, “MMM yyyydd”, System.Globalization.CultureInfo.InvariantCulture).ToString(“MMM yyyy dd”))
This give rhe converted date format u want as a output
Then u can write back to the excel with WRITE RANGE activity
Cheers @minal.patil
Parvathy
(PS Parvathy)
October 27, 2023, 1:39pm
5
Hi @minal.patil
Try this:
Input= "APR 202314 FriDr.Baba Saheb Ambedkar Jayanti
MAY 202301 MonMaharashtra Day
JUN 2023No Holidays
JUL 2023No Holidays
AUG 202315 TueIndependence Day30 WedRaksha Bandhan
SEP 202328 ThuGanesh Visarjan/Eid-E-Milad
OCT 202302 MonMahatma Gandhi Jayanti24 TueDashera
NOV 202315 WedBhaubij
DEC 2023No Holidays
JAN 2024No Holidays
FEB 2024No Holidays
MAR 2024No Holidays"
Output= System.Text.RegularExpressions.Regex.Match(Input,"\b[A-Z]+\s+\d+\b").Value
Str_out= DateTime.ParseExact(Output.ToString,"MMM yyyydd",System.Globalization.CultureInfo.InvariantCulture).ToString("MMM yyyy dd")
Hope it helps!!
lrtetala
(Lakshman Reddy)
October 27, 2023, 5:11pm
6
Hi @minal.patil
Please find the below xaml for your reference
BlankProcess14.zip (78.1 KB)
O/P:
TestingSheetdate.xlsx (12.8 KB)
Hope this helps!!
and also how to capture 'Independence Day30 ’ and 'Mahatma Gandhi Jayanti24 ’
also how to capture 'Independence Day30 ’ and ‘Mahatma Gandhi Jayanti24’.
ppr
(Peter)
October 30, 2023, 12:52pm
9
from the above screenshot, we do see it caught:
ppr
(Peter)
October 30, 2023, 1:05pm
12
just work carefully. We didn’t use a space between year and day. Also it is not present within your text pattern. Same also for the part after the day
Sorry for my mistake.
Also, find these two dates
AUG 2023 30 means 30
OCT 2023 24 means 24
Note:- In Aug month there are two dates 15 and 30 or more than two. I am finding both dates. same as Oct.
ppr
(Peter)
October 30, 2023, 1:28pm
14
As long you will have no match within the preview it is different
Whenever a unmatched parenthese (also idicated as red) is reported, then correct the pattern as shown above
we do not recommend to change again and again the text pattern on which we orient to find out the regex pattern. Just name the requirements right at the beginning completely
Maybe following helps, but then trim the groups
@minal.patil
Try this pattern:
[A-Z]{3}\s*\d{4}\s*\d{2}.*\d\b|[A-Z]{3}\s*\d{4}+\s*\d{2}
Note that this returns the complete string “OCT 202302 MonMahatma Gandhi Jayanti24” and you’d have to do some string manipulation to get the text off.
I am highlight the date ,month and year.
Not working this pattern:
[A-Z]{3}\s*\d{4}\s*\d{2}.\d\b|[A-Z]{3}\s \d{4}+\s*\d{2}
Hi @minal.patil ,
Could you maybe check with the below workflow :
Regex_FindHolidayDates.zip (9.8 KB)
ppr
(Peter)
October 31, 2023, 8:38am
19
we could extend the Regex Pattern to:
And referering / checking the groups:
Still we do feel that the input has a certain risk of unreliability.
Depending on the input (single string / full text with line breaks) we would suggest to combine it with some other techniques to control the complexity of the regex part
system
(system)
Closed
November 3, 2023, 8:38am
20
This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.