Retrieve a hyphenated last name with a regular expression

I have a PDF that I remove spaces from so that I can accurately extract someone’s name from the middle of the document. It works perfectly unless the person has a hyphen ( - ) in their last name. In that case, it only takes the first part of the last name.

I need to be able to get all of it.

Here is what I do so far.

Use Assign:
Name = System.Text.RegularExpressions.Regex.Match(Spaceless,"(?<=KellerPostmanrepresents).+(?=,a)").ToString

FirstName = System.Text.RegularExpressions.Regex.Split(Name,"(?<!^)(?=[A-Z])")(0)

LastName = System.Text.RegularExpressions.Regex.Split(Name,"(?<!^)(?=[A-Z])")(1)

How do I make the last name dynamic, so that if the last name is hyphenated (e.g., Smith-Johnson) it grabs the entire last name, but if it is a single name (e.g., Smith) it takes just Smith?

Any help would be appreciated.

Can you share the sample input and output? @atarantino

It’s sensitive data, but the input would be something like this:

Input:
The client we are representing is Drew Smith-Johnson in the case against the state.

Output (for LastName):
Smith-

Input:
The client we are representing is Drew Smith in the case against the state.

Output:
Smith

In the first scenario I’d like it to say Smith-Johnson. But each document I scan is different, and some don’t have the hyphen.
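For anyone following along, here is a minimal sketch of why the original split returns Smith-. The value DrewSmith-Johnson is a made-up stand-in for the spaceless Name variable, not the real data. The pattern splits before every capital letter except at the start of the string, so the J after the hyphen also becomes a split point and index (1) only holds the first half of the last name.

```vb
Imports System
Imports System.Text.RegularExpressions

Module OriginalSplitDemo
    Sub Main()
        ' Hypothetical spaceless name, standing in for the real (sensitive) data.
        Dim name As String = "DrewSmith-Johnson"

        ' Original pattern: split before every capital letter except at the start.
        Dim parts As String() = Regex.Split(name, "(?<!^)(?=[A-Z])")

        ' The capital J after the hyphen is also a split point, so there are three pieces.
        Console.WriteLine(String.Join(" | ", parts))   ' Drew | Smith- | Johnson
        Console.WriteLine(parts(1))                    ' Smith-
    End Sub
End Module
```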

Try this pattern for the last name @atarantino

(?<!^)(?=.[A-Z])

Regards
Sudharsan

So that resulted in the following:

image

In the above picture, the example nSmith is the one that has a hyphen. It seems the pattern you suggested took the last letter of the first name and put it in front of the last name.
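A likely explanation, sketched with the made-up value DrewSmith-Johnson rather than the real data: the lookahead (?=.[A-Z]) requires any one character followed by a capital letter, so each split point moves one character to the left. The last letter of the first name then ends up at the front of the second piece, and the hyphen ends up at the front of the third.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ShiftedSplitDemo
    Sub Main()
        ' Hypothetical spaceless name, standing in for the real (sensitive) data.
        Dim name As String = "DrewSmith-Johnson"

        ' The "." in the lookahead consumes one character before the capital,
        ' so the split happens one position too early.
        Dim parts As String() = Regex.Split(name, "(?<!^)(?=.[A-Z])")

        Console.WriteLine(String.Join(" | ", parts))   ' Dre | wSmith | -Johnson
    End Sub
End Module
```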

@atarantino

Can you try this, please?

(?<!^|-)(?=[A-Z])

cheers
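A small sketch of how this pattern behaves, using made-up names in place of the real data. The lookbehind (?<!^|-) rejects a split point at the start of the string or directly after a hyphen, so a capital letter following a hyphen no longer breaks the last name apart. The helper name SplitName is hypothetical and only there for illustration; in an Assign activity the Regex.Split call can be used directly with index (0) or (1).

```vb
Imports System
Imports System.Text.RegularExpressions

Module HyphenSafeSplitDemo
    ' Splits a spaceless "FirstLast" string before each capital letter,
    ' except at the start of the string or right after a hyphen.
    Function SplitName(name As String) As String()
        Return Regex.Split(name, "(?<!^|-)(?=[A-Z])")
    End Function

    Sub Main()
        ' Hypothetical inputs: one hyphenated last name, one plain last name.
        For Each sample As String In {"DrewSmith-Johnson", "DrewSmith"}
            Dim parts As String() = SplitName(sample)
            Console.WriteLine(parts(0) & " | " & parts(1))
        Next
        ' Prints:
        ' Drew | Smith-Johnson
        ' Drew | Smith
    End Sub
End Module
```

Note that this still assumes exactly one capital letter in each of the first and last names; a name such as McDonald would be split again at the second capital.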


Perfect !! thank you !!
