Regex solution needed for Read PDF Text activity output

I have a process that reads a PDF to extract a data variable. The file is complex and was created by LiveCycle Designer. Opening the file is slow and a performance drain, so I would rather use a “Read PDF Text” activity and regex the text output if I can. The output from Read PDF Text contains a lot of text, but the one value I am looking for is found where the row reads “SECTION I”. Two rows below that, I need the last data value (on the right), which is a login id. Is a regex possible to spot the row that starts with “SECTION I” and then go down 2 rows and pick the value on the right?

SECTION I
Request Type Request Date Current Loginid
Deactivate 03/30/21 iamauserid

A couple of related points. The first 2 rows are always identical. Literally the only data that is unique is the date and the loginid. Let me know if there is an option to parse like that.

Hi @jamiejam - So you would like to extract the LoginID value, which is dynamic, from the row underneath SECTION I? Or the literal value "iamauserid"?

@jamiejam - Assuming there is no other data after iamauserid, you can try this pattern…

Pattern:

output = System.Text.RegularExpressions.Regex.Match(your_string,"(?<=SECTION I\s?.+\s?.+\s?\d{2}\/\d{2}\/\d{2}\s+).+").Value

Link: regex101: build, test, and debug regex

Or

output = System.Text.RegularExpressions.Regex.Match(your_string,"(?<=SECTION I\s?.+\s?\w*\s+\d.*\s+).+").Value

Link: regex101: build, test, and debug regex

Thanks very much @prasath17 and @Adrian_Star for your expertise on this one. The value I am extracting from the PDF text is preceded and followed by other text. Of the options you noted, (?<=SECTION I: Account Action\s?.+\s?\w*\s+\d.*\s+).+ hits that value most precisely, but it also picks up the value in the next section (line of text) that follows after the carriage return.

How can I get the regex to stop reading at the end of that line, given that all the lines (sections) that follow will be different? In the sample below I need to pull back iamauserid, but the regex also picks up SECTION II More Stuff.

SECTION I
Request Type Request Date Current Loginid
Deactivate 03/30/21 iamauserid
SECTION II More Stuff
Blah blah
SECTION III Even More Stuff
Blah blah blah
SECTION IV: So much stuff
Blah blah blah

@jamiejam - I see, it is pulling the Right value with the pattern provided…
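If the overrun onto the next line keeps causing trouble, one option that sidesteps regex entirely is to split the text into lines and index from the header. The following is only a rough sketch, assuming the Read PDF Text output keeps each row on its own line; `LoginIdFromLines` is a hypothetical helper name, not something from this thread or from UiPath.

```vbnet
Imports System

Module LineSplitSketch
    ' Hypothetical helper: finds the "SECTION I" header line, moves down two
    ' lines, and returns the last whitespace-separated token on that line.
    Function LoginIdFromLines(pdfText As String) As String
        ' Split on both Windows (CRLF) and Unix (LF) line endings.
        Dim lines As String() = pdfText.Split({vbCrLf, vbLf}, StringSplitOptions.None)
        For i As Integer = 0 To lines.Length - 3
            Dim header As String = lines(i).Trim()
            ' Accept "SECTION I", "SECTION I: ..." or "SECTION I ..." but not "SECTION II".
            If header = "SECTION I" OrElse
               header.StartsWith("SECTION I:", StringComparison.Ordinal) OrElse
               header.StartsWith("SECTION I ", StringComparison.Ordinal) Then
                Dim tokens As String() = lines(i + 2).Split(
                    New Char() {" "c, ControlChars.Tab},
                    StringSplitOptions.RemoveEmptyEntries)
                If tokens.Length > 0 Then Return tokens(tokens.Length - 1)
            End If
        Next
        Return String.Empty ' header not found
    End Function

    Sub Main()
        Dim sample As String = String.Join(vbCrLf, {
            "SECTION I",
            "Request Type Request Date Current Loginid",
            "Deactivate 03/30/21 iamauserid",
            "SECTION II More Stuff",
            "Blah blah"})
        Console.WriteLine(LoginIdFromLines(sample)) ' prints iamauserid
    End Sub
End Module
```

In a workflow the same logic could probably be reduced to a couple of Assign activities, but that depends on how the real PDF text is laid out.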

Hi,

output = System.Text.RegularExpressions.Regex.Match(your_string,"(?<=SECTION I\s?.+\s?\w*\s+\d.*\s+).+\s+(?=SEC)").Value.Trim()

Link: regex101: build, test, and debug regex
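Another way to stop at the end of the login id line is to anchor the pattern to line boundaries and read a capture group instead of relying on a lookbehind. This is a sketch rather than a tested fix for the real PDF: it assumes the login id is the last non-space token on the second line after the header, and `ExtractLoginId` is a hypothetical name used only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LoginIdRegexSketch
    ' Hypothetical helper: returns the last token on the second line after
    ' the "SECTION I" header, or an empty string when nothing matches.
    Function ExtractLoginId(pdfText As String) As String
        ' ^SECTION I\b      header line; \b keeps "SECTION II" from matching
        ' [^\r\n]*\r?\n     rest of a line plus its line break (CRLF or LF)
        ' [^\r\n]*?(\S+)    lazily skip ahead, capture the final token
        ' [ \t]*\r?$        allow trailing blanks and a CR before end of line
        Dim pattern As String =
            "^SECTION I\b[^\r\n]*\r?\n[^\r\n]*\r?\n[^\r\n]*?(\S+)[ \t]*\r?$"
        Dim m As Match = Regex.Match(pdfText, pattern, RegexOptions.Multiline)
        Return If(m.Success, m.Groups(1).Value, String.Empty)
    End Function

    Sub Main()
        Dim sample As String = String.Join(vbCrLf, {
            "SECTION I",
            "Request Type Request Date Current Loginid",
            "Deactivate 03/30/21 iamauserid",
            "SECTION II More Stuff",
            "Blah blah"})
        Console.WriteLine(ExtractLoginId(sample)) ' prints iamauserid
    End Sub
End Module
```

Because `[^\r\n]` can never cross a line break, the match cannot run on into SECTION II, and the explicit `\r?` covers text that comes back with Windows line endings. In an Assign activity the equivalent expression would be `Regex.Match(your_string, pattern, RegexOptions.Multiline).Groups(1).Value`.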