RESUME NAME EXTRACTION USING REGEX

Hello Forum,
I want to extract the “names” from resumes using regex. It should return the name whatever format it appears in. For example: 1) one resume will have NAME:XXX in the first line, 2) another will just have the name in the first line, and 3) in a few resumes the name will be on the second line, in which case I need to extract it from the second line. All of these conditions should be handled by the pattern in the Matches activity.

Can somebody please help me out?

@Soundarya_L

Can you confirm that NAME: will always be there, even if its position changes?

If it is fixed then check as below
image

Please share some examples so that we can understand the requirement clearly.

Thanks

Hi
“NAME:” will not be there in a few resumes, and it is not in a fixed position either. Example:

    1)NAME: RAMA LAKSHMI
    2)RAMA LAKSHMI
    3)RAMA LAKSHMI A.P
    4)A.P RAMA LAKSHMI

Also, in a few Word resumes RAMA LAKSHMI will be on the first line, and in others it will be on the second line. So I want a regex that works for all four conditions mentioned above.

Hello

Try this pattern in a matches activity:
(?<=NAME:\s|2\)|3\)|4\)).*

Cheers

Steve

Hey, hi. Thanks for the reply.
But “NAME:” will not be there in every resume. If I use this condition, it will output only the names that appear as “NAME:XXX”. I want a pattern that works for all 4 formats above.

Hello

Yep,

It will capture all the names in the samples you provided, regardless :)

Then use an Assign activity to convert the first match to a string, like this:
StringName = MATCHESRESULT(0).ToString

Replace MATCHESRESULT with the name of your actual Matches output variable.
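
For anyone who wants to try this outside UiPath, here is a minimal Python sketch of the same idea, not the exact activity setup. Python's `re` module does not accept a lookbehind whose alternatives have different lengths, so this sketch consumes the prefix with a non-capturing group and reads the name from group 1 instead. The names `PREFIX_PATTERN`, `first_match` and `samples` are made up for illustration.

```python
import re

# Prefix is consumed by a non-capturing group; the name is captured in group 1.
# (The .NET pattern above uses a lookbehind for the same purpose.)
PREFIX_PATTERN = re.compile(r"(?:NAME:\s|[234]\))(.*)")

samples = [
    "1)NAME: RAMA LAKSHMI",
    "2)RAMA LAKSHMI",
    "3)RAMA LAKSHMI A.P",
    "4)A.P RAMA LAKSHMI",
]

def first_match(line):
    """Return the text after the prefix, or None if no prefix is found."""
    m = PREFIX_PATTERN.search(line)
    return m.group(1) if m else None

results = [first_match(s) for s in samples]
for sample, result in zip(samples, results):
    print(repr(sample), "->", repr(result))

# A line with no "NAME:" label and no list number gives no match at all.
print(first_match("RAMA LAKSHMI"))
```

Note that the last call returns nothing, because a bare name carries none of the prefixes the pattern looks for.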

I guess you have misunderstood my question. One Word document will have NAME:XXX in the name part, another will have just the name with the initials at the end, and another will have the initials at the front followed by the name. I want only the name from each Word document.

Hello

I’ll modify the Regex :)

Will the number and initials be the same always?

I think it might be best to do it in two steps. :)

Use the Regex above.
Then use a “Replace” activity.
Pattern will be “[A-Z]\.[A-Z]”
Replacement value will be “”
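
A rough Python sketch of that second step, assuming the first step has already produced the name text. It uses the same “[A-Z]\.[A-Z]” pattern and an empty replacement; `strip_initials` is a hypothetical helper name, and the extra trim is an addition to tidy up the space the initials leave behind.

```python
import re

# Same pattern as the Replace activity: capital letter, dot, capital letter.
INITIALS = re.compile(r"[A-Z]\.[A-Z]")

def strip_initials(name):
    """Delete initials such as 'A.P' and trim the leftover whitespace."""
    return INITIALS.sub("", name).strip()

print(strip_initials("RAMA LAKSHMI A.P"))
print(strip_initials("A.P RAMA LAKSHMI"))
print(strip_initials("RAMA LAKSHMI"))
```

Names without initials pass through unchanged, so the step is safe to apply to every match.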

Hey, thanks for the reply. But again, your code works only for “XXX.YY” (names with initials). I want a regex that works for different kinds of formats, say RAMALAKSHMI, RAMA LAKSHMI, RAMA LAKSHMI A.P, A.P RAMA LAKSHMI.

We don’t know how candidates will write their names in their resumes. It can be in any format. So I want to extract the name from the resume whatever the format.

Hello again

Names are notoriously tough. But maybe give this pattern a try:
(?<=NAME:\s|2\)|4\)[A-Z.]+\s).*|(?<=3\)).*(?=\s[A-Z]\.[A-Z])

Hopefully this helps :)

Hello. Thank you for taking the time to explain. But the numbers “1) 2) 3) 4)” I mentioned were just for illustration. I do not want those numbers in the result. I just want the names (with initials at the front or back); basically, the regex should return any kind of name format used in the resume. Hope this is clear.

Hello

Are you able to provide some more samples so I can understand better? :)

Hi
Think of 10 resumes you’re getting as a recruiter. I want to extract the name from each of them. The name can be in any format, right? We don’t know how the candidate will have written it. So I want a regex for that.

Hello

Regex relies heavily on the text following a known pattern, so Regex might not be the best tool to use for this scenario…
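
That said, if the name is always on the first or second line, a small heuristic can cover the formats listed in this thread. Below is a minimal Python sketch, not a UiPath workflow and not a general solution. It checks the first two non-empty lines, takes whatever follows a “NAME:” label if there is one, and otherwise accepts a line made only of letters, dots and spaces. The helper `guess_name`, the `HEADINGS` set and the `max_lines` parameter are all assumptions made for illustration.

```python
import re

# Optional label, e.g. "NAME: RAMA LAKSHMI" or "Name:RAMA LAKSHMI".
LABEL = re.compile(r"^\s*NAME\s*:\s*(.+)$", re.IGNORECASE)
# A line of letter groups separated by single spaces or dots, e.g. "A.P RAMA LAKSHMI".
NAME_LIKE = re.compile(r"^[A-Za-z]+(?:[ .][A-Za-z]+)*\.?$")
# Assumed document titles that can push the name down to the second line.
HEADINGS = {"RESUME", "CURRICULUM VITAE", "CV", "BIODATA"}

def guess_name(text, max_lines=2):
    """Return the first name-like line among the first non-empty lines, else None."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    for line in lines[:max_lines]:
        labelled = LABEL.match(line)
        if labelled:
            return labelled.group(1).strip()
        if line.upper() not in HEADINGS and NAME_LIKE.match(line):
            return line
    return None

print(guess_name("NAME: RAMA LAKSHMI\nPhone: 12345"))
print(guess_name("RESUME\nRAMA LAKSHMI A.P\nChennai"))
print(guess_name("A.P RAMA LAKSHMI\nrama@example.com"))
```

This will still be fooled by any short heading that looks like a name (a job title on the first line, for instance), which is the limitation described above: without a reliable pattern, regex can only guess.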
