Splitting a string of file names into an array of file names

Hey everyone! I’m trying to figure out how to take a string of file names that doesn’t have any definitive separator and split it into an array of strings containing each file name.

Example input: “Test 1.pdfTest2.docxTest3.pdfTest 4.jpg”
Desired output: {“Test 1.pdf”, “Test2.docx”, “Test3.pdf”, “Test 4.jpg”}

So far I’m looking at using regex to split the input string, but the only idea I can think of is something like this: System.Text.RegularExpressions.Regex.Split(in_Var, "\..{4}")

This doesn’t handle extensions of varying lengths, and it drops the extensions from the output, because Split discards whatever the pattern matches. The output is: {“Test 1”, “est2”, “Test3”, “est 4.jpg”}
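
To make the failure concrete, here is a minimal sketch of the same split in Python rather than in the workflow itself. It assumes Python's re.split behaves like Regex.Split for a pattern with no capture groups, and the variable names are only for illustration.

```python
import re

text = "Test 1.pdfTest2.docxTest3.pdfTest 4.jpg"

# The pattern matches a literal dot plus the next four characters.
# split() throws every match away, so ".pdfT" and ".docx" vanish,
# and ".jpg" never matches because only three characters follow the dot.
parts = re.split(r"\..{4}", text)

print(parts)
```

The separator match swallows the extension (and sometimes the first letter of the next name), which is why matching whole file names looks more promising than splitting.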

Any help or guidance on this would be great!

Hello, @geneddie - Is the first letter of the filename always uppercase? If yes, use this:

System.Text.RegularExpressions.Regex.Matches(in_Var, "[A-Z][\w\s]+(\.[a-z]+)")

If the first letters are not always uppercase, and the filetypes are known, use this:

System.Text.RegularExpressions.Regex.Matches(in_Var, "([\w\s]+\.)(jpg|pdf|docx)")

Both capture the four file names from your example input: Test 1.pdf, Test2.docx, Test3.pdf and Test 4.jpg.

Thanks!

Hi,

I think it’s difficult to separate them correctly if there is no rule. For example, “xyz.xlsxyz.docx” can be split into xyz.xlsx and yz.docx, or into xyz.xls and xyz.docx.
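
A rough way to see the ambiguity is to list every split that is consistent with a set of known extensions. This is only a sketch in Python; the function name segmentations and the extension list are made up for the illustration.

```python
def segmentations(text, extensions):
    """Return every way to cut `text` into names that end in a known extension."""
    if text == "":
        return [[]]  # one valid way to split nothing: no file names
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        # A candidate name needs at least one character before ".ext".
        is_name = any(
            head.lower().endswith("." + ext) and len(head) > len(ext) + 1
            for ext in extensions
        )
        if is_name:
            for rest in segmentations(text[end:], extensions):
                results.append([head] + rest)
    return results


for option in segmentations("xyz.xlsxyz.docx", ["xls", "xlsx", "doc", "docx"]):
    print(option)
```

Both readings of the xls/xlsx example show up in the result, along with the whole string treated as a single file name, so the input alone cannot tell us which one was intended.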

If the possible extensions are known in advance, the following might help.

yourString="Test 1.pdfTest2.docxTest3.pdfTest 4.jpg"
arrExts={"jpg","docx","xlsx","pdf","pptx","txt","png"}

then

files = System.Text.RegularExpressions.Regex.Matches(yourString,".+?\.("+String.Join("|",arrExts)+")",System.Text.RegularExpressions.RegexOptions.IgnoreCase).Cast(Of System.Text.RegularExpressions.Match).Select(Function(m) m.Value).ToArray()
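
For anyone who wants to try the idea outside the workflow, here is a small Python sketch of the same pattern. The function name split_file_names is hypothetical, and trying longer extensions first is my own assumption: it makes the match prefer "xlsx" over "xls", but it is a heuristic and does not remove the ambiguity.

```python
import re


def split_file_names(text, extensions):
    """Extract file names from `text`, assuming each ends in a known extension."""
    # Assumption: try longer extensions first so "xlsx" wins over "xls".
    ordered = sorted(extensions, key=len, reverse=True)
    alternation = "|".join(re.escape(ext) for ext in ordered)
    # Lazy ".+?" stops at the first dot that is followed by a known extension.
    pattern = re.compile(r".+?\.(?:" + alternation + r")", re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


names = split_file_names(
    "Test 1.pdfTest2.docxTest3.pdfTest 4.jpg",
    ["jpg", "docx", "xlsx", "pdf", "pptx", "txt", "png"],
)
print(names)
```

With the example string this gives the four expected names. If a shorter extension is the right one for a particular file, the longest-first choice will get it wrong.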

Regards,

Thanks for your help and guidance so far! The files are not always going to have the first letter capitalized nor will they be limited to any specific file types, unfortunately.