RegEx Manipulation: Pulling two different values from one string

Hey all,

So here’s the lowdown. I’m trying to pull two different values out of each file name listed in a folder. The sequence is ForEach item in Directory.GetFiles("Data\SomeFolder").

Each of the files is a PDF, and I need to pull certain information out of each file name. The structure of the file name is as follows:
##-#######, a space, the NAME OF FILE WITH SPACES, a space, then 2019 Federal Tax Return.pdf

Example: 22-1234567 This Is The File Name LLC 2019 Federal Tax Return.pdf

I’ve played around with regex101.com and have been able to pull the first value using (^\d{2}\D\d{7}), which returns 22-1234567. Now all I need to pull is the rest, minus the 2019 Federal Tax Return.pdf

Example: This Is The File Name LLC

I also need to trim the white space on both ends.

Any help on this would be amazing. Thanks in advance. Happy Automating!

Is it always going to be 2019 there? Or maybe always a year?

It is

(?<=\d{1} )(.*)(?= 2019)
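
For anyone who wants to check both patterns together outside regex101, here is a minimal sketch in Python (used only for illustration; the thread itself is working with .NET-style calls). It applies the ID pattern from the question and the lookaround pattern above, then trims the name. The helper name parse_file_name and the constant names are made up for this example, and it assumes the year is always 2019, as confirmed above.

```python
import re

# Pattern from the question: two digits, one non-digit, seven digits at the start.
ID_PATTERN = re.compile(r"^\d{2}\D\d{7}")

# Pattern from the reply above (\d{1} is equivalent to \d).
# Assumes the text " 2019" always follows the name.
NAME_PATTERN = re.compile(r"(?<=\d )(.*)(?= 2019)")


def parse_file_name(file_name):
    """Hypothetical helper: return (id, name), or None if either part is missing."""
    id_match = ID_PATTERN.search(file_name)
    name_match = NAME_PATTERN.search(file_name)
    if id_match is None or name_match is None:
        return None
    # strip() removes any leftover white space on both ends of the name.
    return id_match.group(0), name_match.group(1).strip()


example = "22-1234567 This Is The File Name LLC 2019 Federal Tax Return.pdf"
print(parse_file_name(example))
```

Returning None for names that do not fit the structure makes it easy to skip stray files in the folder rather than fail on them.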


You’re A Boss! Thank you!!!


So now I’m running into this. Any clue as to how to remove the leading part of the file path from the source directory so that I’m left with the true file name? Is that something I can do in the ForEach statement or is that something I’m going to have to do through RegEx?

@BotMonkey

fileName = Split(stringAsAWhole, "SourcePDFs\")

(assuming the filename starts after "SourcePDFs" - you have blacked it out, so I am not sure.)
fileName is a new variable of type Array[String].
The file name will be available in fileName(1).
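
As a rough illustration of the same idea, here is a small Python sketch (Python is used only to keep the example easy to run). It shows the split approach next to a path-aware alternative that does not depend on the folder name. The example path is hypothetical, since the real one was blacked out, and file_name_only is a made-up helper name. In .NET, System.IO.Path.GetFileName should do the same job as the path-aware version.

```python
from pathlib import PureWindowsPath

# Hypothetical full path; only the "SourcePDFs" folder name comes from the thread.
example = r"C:\Data\SourcePDFs\22-1234567 This Is The File Name LLC 2019 Federal Tax Return.pdf"


def file_name_only(full_path):
    """Hypothetical helper: return the last component of a Windows-style path."""
    return PureWindowsPath(full_path).name


# Split approach: everything after the folder name ends up in element 1.
by_split = example.split("SourcePDFs\\")[1]

# Path-aware approach: works whatever the parent folders are called.
by_path = file_name_only(example)

print(by_split)
print(by_path)
```

The split version breaks if the folder is ever renamed, which is why the path-aware version may be the safer choice inside the ForEach.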


Thank you!
