Regex expression to match a keyword within a string, delimited by spaces, commas and semicolons

jcab · April 6, 2018, 9:56pm

Hi all!

So the scenario is the following one:

I have an Excel file with several keywords.
I have to find all of these keywords within a large number of CVs. Each CV is completely different.
I simply have to detect that the keyword is within the CV.

What I did:
I used the PDF Read activity and the whole text is put into the string “PDFText”.
Then I use a For Each activity that reads each keyword from the Excel file, and inside it I call PDFText.Contains(keyword) to see if the keyword is there.

This seems fairly simple. But imagine that the keyword is “ERP” and that “PDFText” contains the word “PowerPoint”. Once case is ignored, “PowerPoint” contains “erP”, so a substring check like PDFText.Contains(keyword) would give TRUE. And this is not what I want. I want to detect the word “ERP” as a separate word, not as a substring of another word.

So I think the best solution would be to match the string “PDFText” against a regex with a given pattern.
The pattern would consist of the “keyword”, and before and after the “keyword” there must be either a space, a comma or a semicolon.
Furthermore, if the keyword is at the beginning of the string (or at the beginning of a line?), there will be no space, comma or semicolon before the “keyword”, so the regex pattern should allow for this as well. The same applies if the keyword is at the end of the string (or at the end of a line?).
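
For illustration, here is a minimal sketch of the pattern described above, written in Python rather than in a UiPath/VB.NET expression. The function name `contains_delimited` and the sample strings are made up for this example. It uses `\s` (any whitespace, including line breaks) instead of a literal space so that the start and end of each line are covered too, and it ignores case; both choices are assumptions rather than something stated in the post.

```python
import re

def contains_delimited(text: str, keyword: str) -> bool:
    """True if keyword occurs bounded by whitespace, ',' or ';' (or a string edge)."""
    # (?<![^\s,;]) : the previous character, if there is one, must be whitespace, ',' or ';'
    # (?![^\s,;])  : the next character, if there is one, must be whitespace, ',' or ';'
    # re.escape keeps characters such as '+' or '.' in the keyword literal.
    pattern = r"(?<![^\s,;])" + re.escape(keyword) + r"(?![^\s,;])"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

print("erp" in "PowerPoint".lower())                        # True  (plain substring check)
print(contains_delimited("Uses PowerPoint daily", "ERP"))   # False (inside another word)
print(contains_delimited("Word;ERP,Excel", "erp"))          # True  (between ';' and ',')
```

Note that a keyword followed by other punctuation, such as a period or a colon, would not match with this delimiter set; the character class would need to be extended for that.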

Do you think this is the best way to detect a given keyword as being an independent word and not being a substring of a given string?

Could someone please indicate what the exact pattern expression would be?

Thanks in advance!

jcab · April 7, 2018, 7:39am

Hi again,

I don’t want to scare you with the long text I wrote in my previous message.

I just want to find the word “keyword” within a string with a regex pattern. The pattern would consist of the “keyword”, and before and after the “keyword” there must be either a space, a comma or a semicolon.

Thanks

fudi5 · April 7, 2018, 8:44am

Hi @jcab,

I’m no expert here, but I have an idea. If you know the keyword, you could put a Switch inside the For Each, which takes a string and then assigns a value to a variable. You would need to split the string first, though. Or, if you know all the keywords you need, you could create a string array such as {"ERP", "PowerPoint"} and check in a loop whether each one is present.

Regards
@fudi5

arivu96 · April 7, 2018, 9:43am

Hi @jcab,

Can you provide a sample input string and your expected output as well, so we can understand your requirement clearly?

Regards,
Arivu

tmays · April 8, 2018, 6:08pm

Hey @jcab,

If you’re wanting to use regex, you can use something like

"\b(?i)" & keyword & "\b"

for your match pattern.

\b - The match must occur on a boundary between a \w (word character: letter, digit or underscore) and a \W (non-word) character. This works at the start/end of a line too.
(?i) - Use case-insensitive matching.
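
As a rough illustration of how these two pieces behave together, here is a small Python sketch (not the UiPath expression itself). The function name `has_whole_word` is hypothetical, `re.IGNORECASE` stands in for the inline `(?i)` option, and the `re.escape` call is an addition that is not part of the pattern above.

```python
import re

def has_whole_word(text: str, keyword: str) -> bool:
    # \b on each side requires a word/non-word boundary around the keyword.
    # re.escape is an extra safeguard for keywords containing regex metacharacters.
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

print(has_whole_word("PowerPoint and Excel", "ERP"))  # False: "erP" sits inside a word
print(has_whole_word("SAP;erp, Excel", "ERP"))        # True: bounded by ';' and ','
print(has_whole_word("ERP", "erp"))                   # True: string edges count as boundaries
```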

I would call my regex skills mediocre at best, so I always use these two resources. I find an online regex tester very useful when trying to work out the right pattern.

Regex Tester
Regular Expression Language - Quick Reference

jcab · April 8, 2018, 6:47pm

Thanks.

I’m currently using the following regex pattern:
"\b" + keyword + "\b"

But for some reason I’m unable to detect the keyword C#: with the pattern "\bC#\b", C# is not matched as a separate whole word.
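
A likely explanation is that "#" is not a word character, so the trailing \b only matches when a letter, digit or underscore comes directly after the "#". In "C#," or "C# " there is no such boundary, and the match fails. One possible workaround, sketched here in Python with a hypothetical helper `find_keyword`, is to replace \b with lookarounds that only require that no word character touches the keyword. The same lookaround syntax exists in .NET regex, but this sketch has not been tried in UiPath.

```python
import re

def find_keyword(text: str, keyword: str) -> bool:
    # (?<!\w) : no word character immediately before the keyword
    # (?!\w)  : no word character immediately after the keyword
    # re.escape makes '#', '+' and similar characters literal.
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

text = "Languages: C#, C++; Java"
print(re.search(r"\bC#\b", text) is not None)  # False: no boundary between '#' and ','
print(find_keyword(text, "C#"))                # True
print(find_keyword(text, "C++"))               # True
print(find_keyword("JavaScript", "Java"))      # False
```

One caveat with this sketch: the keyword "C" on its own would also match inside "C#", because "#" is not a word character.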

Ashwini_Kumaran · November 16, 2018, 1:28am

@tmays - I have a table whose rows (strings) contain a date labelled “EndDate”. The label is sometimes written as “enddate”, and the date format varies from 08/21/2018 to 21 August 2018. How can I extract just the date? Any ideas?
