Fuzzy Matching in string comparison

excel
activities

#1

Hi,

I need to search for strings from one Excel file in another one. So far, I check whether the full string, or any of its words (splitting the string on spaces), is found in the other Excel sheet.
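For readers following along, the approach described above can be sketched in a few lines. This is a plain Python illustration, not a UiPath workflow; the function name `naive_match` and the sample company names are made up for the example.

```python
def naive_match(name, candidates):
    """Return candidates containing the full name, else those containing any word of it."""
    needle = name.lower()
    # Step 1: the full string appears as a substring of a candidate.
    full = [c for c in candidates if needle in c.lower()]
    if full:
        return full
    # Step 2: fall back to any single space-delimited word of the name.
    words = needle.split()
    return [c for c in candidates if any(w in c.lower() for w in words)]


# Hypothetical sample data.
candidates = ["Acme Corporation Ltd", "Globex Inc", "Initech"]
print(naive_match("Acme Corp", candidates))
print(naive_match("Acme Inc", candidates))
```

The second call shows the weakness of the word fallback: a generic word such as "Inc" is enough to pull in an unrelated company.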

Now the client wants me to do fuzzy matching of the strings, similar to what the Fuzzy add-on does in Excel. Can this be achieved using UiPath?
If yes, how?
This is urgent, so please reply soon.
Thanks in advance.


#2

I faced a similar situation recently, and from what I’ve seen the nearest alternative is the Matches activity, which uses regular expressions to find a certain substring in a string.

It only works in certain situations, though. Let’s say one Excel file has “Ed Jones” and the other has “Edward Jones”. You could set up the Matches activity to look for the whole string (“Edward Jones”) or to look for one word at a time (first check whether “Ed” is in the first word of the other column, which it is in this case, then check “Jones”). If both match, then it’s probably the same person.

It doesn’t quite work the other way around, though. If your input was “Edward Jones”, then it wouldn’t find the word “Edward” in “Ed”, so it would fail. It all depends on your specific needs, really.
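To make the one-way behaviour concrete, here is a small Python sketch of the word-by-word check. It is only an illustration of the idea, not the Matches activity itself, and the function name `words_match` is hypothetical.

```python
def words_match(short_name, full_name):
    """True if every word of short_name is a prefix of the word at the same position in full_name."""
    short_words = short_name.lower().split()
    full_words = full_name.lower().split()
    # Different word counts cannot be compared position by position.
    if len(short_words) != len(full_words):
        return False
    return all(f.startswith(s) for s, f in zip(short_words, full_words))


print(words_match("Ed Jones", "Edward Jones"))   # "Ed" is found in "Edward"
print(words_match("Edward Jones", "Ed Jones"))   # "Edward" is not found in "Ed"
```

Checking both directions (or always treating the shorter word as the prefix) would remove the asymmetry, at the cost of more false positives.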

If your client wants fuzzy matching similar to the Fuzzy add-on, then your best bet may be to just automate Excel to perform the fuzzy matching.


#3

But is that Excel automation reliable, considering we will be dealing with a lot of records?
We would be using Excel as a UI. Is that a good way of doing it?


#4

If you’ve used the Fuzzy add-on, you know that it isn’t 100% exact, but it works with something called a “Similarity Threshold” (the higher it is, the less likely it is to make a mistake, but also the less likely it is to find a match).

Without looking at your table it’s hard to tell whether it would work. It shouldn’t have a problem matching, say, “001 002 23” and “00100223A”. But the more the strings differ, the more likely an error becomes.
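The threshold idea can be tried out with Python’s standard `difflib` module. Note that this is an assumption for illustration: `SequenceMatcher.ratio()` is not the algorithm the Excel add-on uses, so the scores will differ from what the add-on reports. The helper names `similarity` and `is_match` are made up.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity score between 0.0 (nothing in common) and 1.0 (identical), ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_match(a, b, threshold=0.8):
    """Accept the pair only if its score reaches the similarity threshold."""
    return similarity(a, b) >= threshold


score = similarity("001 002 23", "00100223A")
print(round(score, 2))
print(is_match("001 002 23", "00100223A", threshold=0.8))  # accepted at a lower threshold
print(is_match("001 002 23", "00100223A", threshold=0.9))  # rejected at a stricter one
```

Raising the threshold trades missed matches for fewer wrong ones, which is the same trade-off described above.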


#5

Here are the details of the requirement.
I have two Excel files:
excel1 - contains a column called employerName
excel2 - contains a column called sEmployerName
Requirement - for each employerName in excel1, we need to find a similar match in the sEmployerName column of excel2.
Currently, I check whether the full employerName from excel1 is found in sEmployerName of excel2. If not, I split the employerName from excel1 into words and check whether any of those words is found in sEmployerName.
But this is not what is required.
We need to find similar matches between employerName in excel1 and sEmployerName in excel2.
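As a rough sketch of that requirement, the lookup could work as follows: for every employerName, score it against every sEmployerName and keep the best one if it clears a threshold. This uses Python’s standard `difflib` rather than anything from UiPath or the Excel add-on; the sample names, the `best_match` helper, and the 0.6 threshold are all assumptions, and the two lists stand in for columns that would really be read from the Excel files.

```python
from difflib import SequenceMatcher


def best_match(name, candidates, threshold=0.6):
    """Return (candidate, score) for the most similar candidate, or (None, score) if below threshold."""
    best, best_score = None, 0.0
    for cand in candidates:
        score = SequenceMatcher(None, name.lower(), cand.lower()).ratio()
        if score > best_score:
            best, best_score = cand, score
    if best_score < threshold:
        return None, best_score
    return best, best_score


# Hypothetical stand-ins for the employerName and sEmployerName columns.
employer_names = ["Acme Corp", "Globex Inc"]
s_employer_names = ["ACME Corporation", "Globex Incorporated", "Initech LLC"]

for name in employer_names:
    print(name, "->", best_match(name, s_employer_names))
```

Comparing every pair is quadratic in the number of rows, so for a large number of records some pre-filtering (for example by first letter) may be needed.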


#6

Hi mudit and smassau,

I am also looking for the same fuzzy match algorithm. If you have built something, could you please attach the .xaml file?

Thanks for the help.

Thanks,
Dev


#7

Hi,

We used a workaround, as we did not find a stable solution for a fuzzy matching algorithm.