Regex Pattern for data extraction from file names

I have some file names with the .html extension, and I want to extract part of each name. Sample file names are below:

20210608_081442_20-0000428456.html
20210608_081442_328456.html

I need to extract the content between the last “_” and “.html”, i.e. 20-0000428456 and 328456 in the examples above. Please suggest a regex pattern for this.

Hi,

Can you try the following?

System.Text.RegularExpressions.Regex.Match(yourString,"[^_]+(?=\.html)")

or

System.Text.RegularExpressions.Regex.Match(System.IO.Path.GetFileNameWithoutExtension(yourString),"[^_]+$")
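As a rough sketch of how the two expressions above behave on the sample names, here is a small VB.NET console module. The module and function names (FileNameTokenDemo, TokenByLookahead, TokenByAnchor) are made up for illustration and are not part of UiPath. Note that Regex.Match returns a Match object, so .Value is used here to get the matched text as a String.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module FileNameTokenDemo
    ' Hypothetical helper: run of non-underscore characters that is
    ' immediately followed by a literal ".html" (lookahead, not consumed).
    Function TokenByLookahead(fileName As String) As String
        Return Regex.Match(fileName, "[^_]+(?=\.html)").Value
    End Function

    ' Hypothetical helper: strip the extension first, then take the
    ' trailing run of non-underscore characters.
    Function TokenByAnchor(fileName As String) As String
        Dim stem As String = Path.GetFileNameWithoutExtension(fileName)
        Return Regex.Match(stem, "[^_]+$").Value
    End Function

    Sub Main()
        Dim samples As String() = {
            "20210608_081442_20-0000428456.html",
            "20210608_081442_328456.html"
        }
        For Each sample As String In samples
            Console.WriteLine(TokenByLookahead(sample) & " | " & TokenByAnchor(sample))
        Next
        ' Prints:
        ' 20-0000428456 | 20-0000428456
        ' 328456 | 328456
    End Sub
End Module
```

In an Assign activity, the same idea would be the expression itself with .Value appended, assuming the target variable is a String.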

Regards,

@BTanaji - you can take the file name without its extension, split it by “_”, and take the last element.
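A minimal sketch of that split approach in VB.NET, assuming the part you want never contains “_” itself. The module and function names (SplitDemo, LastToken) are hypothetical and only there for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module SplitDemo
    ' Hypothetical helper: drop the extension, split on "_",
    ' and return the last piece.
    Function LastToken(fileName As String) As String
        Dim stem As String = Path.GetFileNameWithoutExtension(fileName)
        Return stem.Split("_"c).Last()
    End Function

    Sub Main()
        Console.WriteLine(LastToken("20210608_081442_20-0000428456.html")) ' prints 20-0000428456
        Console.WriteLine(LastToken("20210608_081442_328456.html"))        ' prints 328456
    End Sub
End Module
```

This avoids regex entirely. If a name has no “_” at all, Split returns the whole stem as a single element, so the result is the file name without its extension.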

Hi @BTanaji

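Here is the regex expression.

Activity

- Use an Assign activity
Variable = System.Text.RegularExpressions.Regex.Match(Input_Value,"[^_]+(?=\.html)").Value

The dot before html is escaped so that it matches a literal “.” rather than any character, and .Value returns the matched text as a String.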

If you face any issues, let me know

Regards,
Gokul
