I need to compare two strings that are similar but not exactly the same. There might be some additional words in one string, or the words might be in a different order, although both strings represent the same thing.
For example:
Google and Google Pvt. Ltd
Computer Generated Solutions and Computer Generated Soltns
Societe Generale and Generale Societe
I need something that can recognize these as the same.
The Levenshtein algorithm doesn't work here because it requires the words to be in the same order.
Thanks
Raviteja
indra
(Indra)
May 21, 2018, 6:42am
2
@Raviteja94 Compare the two strings using Contains in the If condition.
@indra It won't work for the 2nd and 3rd scenarios.
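To illustrate the point, here is a minimal sketch of a Contains-style check run against the three example pairs from the question. The helper name `EitherContains` is hypothetical and only used for this demonstration.

```csharp
using System;

public static class ContainsCheck
{
    // Hypothetical helper: true when either string contains the other, ignoring case.
    public static bool EitherContains(string a, string b)
    {
        return a.IndexOf(b, StringComparison.OrdinalIgnoreCase) >= 0
            || b.IndexOf(a, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        // Extra words only: one string is a prefix of the other.
        Console.WriteLine(EitherContains("Google", "Google Pvt. Ltd")); // True
        // Abbreviated word: neither string contains the other.
        Console.WriteLine(EitherContains("Computer Generated Solutions", "Computer Generated Soltns")); // False
        // Swapped word order: neither string contains the other.
        Console.WriteLine(EitherContains("Societe Generale", "Generale Societe")); // False
    }
}
```

Only the first pair passes, because a substring check needs one string to appear unchanged inside the other.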
arivu96
(Arivazhagan A)
May 21, 2018, 6:50am
4
Hi @Raviteja94 ,
Refer to this link:
http://www.dotnetworld.in/2013/05/c-find-similarity-between-two-strings.html?m=1
Based on the similarity percentage, you can decide whether the strings match.
Similarity in % between Strings
// Requires: using System; using System.Linq;
public static void SimilarityInPercentage()
{
    string string1 = "Manish";
    string string2 = "Mahesh";
    char[] charString1 = string1.ToCharArray();
    char[] charString2 = string2.ToCharArray();
    // Distinct characters present in both strings (Intersect ignores order and duplicates)
    var strCommon = charString1.Intersect(charString2);
    // Formula: Similarity (%) = 100 * (CommonItems * 2) / (Length of String1 + Length of String2)
    double similarity = (double)(100 * (strCommon.Count() * 2)) / (charString1.Length + charString2.Length);
    Console.WriteLine("Strings are {0}% similar", similarity.ToString("0.00"));
}
// Output: Strings are 66.67% similar
Similarity in % between Arrays of String
// Requires: using System; using System.Linq;
public void SimilarityInPercentage()
{
    string[] string1 = new string[] { "Manish", "Dubey", "Dot", "Net", "World" };
    string[] string2 = new string[] { "Dot", "Net", "World" };
    // Distinct items present in both arrays (exact, case-sensitive matches only)
    var strCommon = string1.Intersect(string2);
    // Formula: Similarity (%) = 100 * (CommonItems * 2) / (Length of String1 + Length of String2)
    double similarity = (double)(100 * (strCommon.Count() * 2)) / (string1.Length + string2.Length);
    Console.WriteLine("Strings are {0}% similar", similarity.ToString("0.00"));
}
// Output: Strings are 75.00% similar
Similarity in % between String Sentences
// Requires: using System; using System.Linq;
public void SimilarityInPercentage()
{
    string string1 = "My blog name is Dot Net World";
    string string2 = "Dot Net World";
    // Split each sentence into words on spaces
    string[] splitString1 = string1.Split(' ');
    string[] splitString2 = string2.Split(' ');
    // Distinct words present in both sentences (exact, case-sensitive matches only)
    var strCommon = splitString1.Intersect(splitString2);
    // Formula: Similarity (%) = 100 * (CommonItems * 2) / (Length of String1 + Length of String2)
    double similarity = (double)(100 * (strCommon.Count() * 2)) / (splitString1.Length + splitString2.Length);
    Console.WriteLine("Strings are {0}% similar", similarity.ToString("0.00"));
}
// Output: Strings are 60.00% similar
Regards, Arivu
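The word-level Intersect above only counts exact word matches, so "Solutions" and "Soltns" would not count as common. One possible way around that, sketched here as an assumption rather than as something taken from the linked article, is to compare words with Levenshtein distance but ignore word order: for each word of the shorter string, take its best match in the longer string and average the scores. The names `FuzzyTokenMatch`, `WordSimilarity` and `Similarity` are made up for this sketch.

```csharp
using System;
using System.Linq;

public static class FuzzyTokenMatch
{
    // Classic Levenshtein edit distance between two strings.
    public static int Levenshtein(string s, string t)
    {
        var d = new int[s.Length + 1, t.Length + 1];
        for (int i = 0; i <= s.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= t.Length; j++) d[0, j] = j;
        for (int i = 1; i <= s.Length; i++)
        {
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), d[i - 1, j - 1] + cost);
            }
        }
        return d[s.Length, t.Length];
    }

    // Similarity of two words in [0, 1]; 1 means identical.
    public static double WordSimilarity(string a, string b)
    {
        int maxLen = Math.Max(a.Length, b.Length);
        return maxLen == 0 ? 1.0 : 1.0 - (double)Levenshtein(a, b) / maxLen;
    }

    // Lower-case the input and split it into words on spaces, dots and commas.
    static string[] Tokens(string s) =>
        s.ToLowerInvariant().Split(new[] { ' ', '.', ',' }, StringSplitOptions.RemoveEmptyEntries);

    // Average, over the words of the shorter string, of the best match found in the longer one.
    public static double Similarity(string a, string b)
    {
        var ta = Tokens(a);
        var tb = Tokens(b);
        if (ta.Length == 0 || tb.Length == 0) return 0.0;
        var shorter = ta.Length <= tb.Length ? ta : tb;
        var longer = ta.Length <= tb.Length ? tb : ta;
        return shorter.Average(w => longer.Max(x => WordSimilarity(w, x)));
    }
}
```

With this sketch, `Similarity("Google", "Google Pvt. Ltd")` and `Similarity("Societe Generale", "Generale Societe")` both return 1.0, and `Similarity("Computer Generated Solutions", "Computer Generated Soltns")` returns about 0.89. Because extra words in the longer string are ignored, "Google" would also fully match "Google Maps", so the acceptance threshold (and whether to penalize extra words) has to be chosen for your data.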
@Raviteja94
You can use the Cognitive activities pack.
The cognitive activities pack helps you use Google's, IBM's, Stanford's and Microsoft's APIs, and automatically process the information that they help you extract. The package enables you to translate text from one language to another, as well as...
@Madhavi Can you briefly explain how I can use this for my solution?
Thanks
Raviteja
Ben_Ten
(Benjamin Wied)
November 28, 2018, 12:19pm
7
@Madhavi
I have looked at the Cognitive Activities package, but I am not sure how it is supposed to help with a fuzzy string comparison.
I would like to use fuzzy matching to look for specific values in a PDF file, so that I am not limited to finding the 100% exact term. At the moment I only have RegEx, so perhaps this could be useful.
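For the PDF use case, one option that needs no extra package is a sliding-window search. This is only a sketch and assumes the PDF text has already been extracted into a plain string. It slides a window with as many words as the search term across the text and keeps the window with the smallest Levenshtein distance to the term. `FuzzyFind` and `BestMatch` are hypothetical names.

```csharp
using System;

public static class FuzzyFind
{
    // Levenshtein edit distance, keeping only two rows of the table.
    public static int Levenshtein(string s, string t)
    {
        var prev = new int[t.Length + 1];
        var curr = new int[t.Length + 1];
        for (int j = 0; j <= t.Length; j++) prev[j] = j;
        for (int i = 1; i <= s.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= t.Length; j++)
            {
                int cost = s[i - 1] == t[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(prev[j] + 1, curr[j - 1] + 1), prev[j - 1] + cost);
            }
            var tmp = prev; prev = curr; curr = tmp;
        }
        return prev[t.Length];
    }

    // Returns the run of words in text closest to term, with its similarity in [0, 1].
    public static Tuple<string, double> BestMatch(string text, string term)
    {
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        int n = term.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
        string best = "";
        double bestScore = 0.0;
        for (int i = 0; n > 0 && i + n <= words.Length; i++)
        {
            // Candidate: n consecutive words starting at position i.
            string window = string.Join(" ", words, i, n);
            int maxLen = Math.Max(window.Length, term.Length);
            double score = 1.0 - (double)Levenshtein(window.ToLowerInvariant(), term.ToLowerInvariant()) / maxLen;
            if (score > bestScore) { best = window; bestScore = score; }
        }
        return Tuple.Create(best, bestScore);
    }
}
```

For example, `BestMatch("Invoice issued by Computer Generated Soltns on 21 May", "Computer Generated Solutions")` returns "Computer Generated Soltns" as the closest run of words. You would then accept the hit only if the score is above a threshold you pick, and fall back to RegEx for values with a fixed pattern.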