Compare Names

KevinDS · July 13, 2021, 10:48am

Hi, I have this question:

I’ve got 2 columns that I need to compare to continue

For each row, I need to compare the names in the two columns before continuing. As humans, we know that Charlie Woods and Woods Charlie are the same person, and likewise Anne Carlton and Anne K. CARLTON.

How can I compute a match percentage between the names, so that I can continue when the match is at least 80%?

jeevith · July 13, 2021, 11:10am

Hi @KevinDS,

A while back I developed an Azure function that takes two strings and responds with Levenshtein distance similarity ratios for them. Basically, it measures how similar the letters, the words, and the combinations of letters and words in the two strings are.

The problem falls within statistics, linguistics, and computer science, and is known as calculating the edit distance. In short: what is the smallest number of edits required on one string so that it matches the other? The fewer the edits, the closer the strings are to each other.
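As a rough illustration of the edit-distance idea, here is a minimal pure-Python sketch. The function names `levenshtein` and `similarity_percent` are made up for this example, and the percentage formula is just one simple way to normalise the distance. It is not necessarily the formula the Azure function uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Smallest number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> int:
    """Hypothetical helper: scale the distance to a 0-100 score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100  # two empty strings are identical
    return round(100 * (1 - levenshtein(a, b) / longest))
```

Note that a plain character-level score like this is sensitive to word order and letter case, so names such as "Charlie Woods" and "Woods Charlie" need some normalisation before they score highly.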

It is a vast field in academia, so you will find other measures besides the Levenshtein distance, for example the Jaro–Winkler distance, or the Hamming distance for strings of the same length.
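For comparison, the Hamming distance is much simpler, because it only counts the positions at which two equal-length strings differ. A minimal sketch follows; the function name `hamming` is an assumption for this example.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of the same length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))
```

Because of the equal-length requirement it is rarely a good fit for comparing people's names, which is why the edit-distance family is usually preferred for this kind of problem.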

In this implementation, I used the FuzzyWuzzy library in Python to build the Azure function.
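To get a feel for what a token-sorted comparison does without installing anything, here is a sketch built on the standard library's `difflib`. It is a stand-in and not the library's own code: the helper names `ratio` and `token_sort_ratio` are chosen for this example, the normalisation rules are assumptions, and the scores may differ from what the library returns.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """Similarity of two strings as a 0-100 score, using difflib."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def token_sort_ratio(a: str, b: str) -> int:
    """Compare two strings after lowercasing, dropping punctuation,
    and sorting their words alphabetically."""
    def normalise(text: str) -> str:
        # Replace anything that is not a letter or digit with a space.
        cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
        return " ".join(sorted(cleaned.split()))

    return ratio(normalise(a), normalise(b))
```

With this sketch, `token_sort_ratio("Charlie Woods", "Woods Charlie")` compares "charlie woods" with itself, so the word order no longer matters.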

You can use the HTTP Request activity (GET) in UiPath

URL syntax:
https://pyautomata.azurewebsites.net/api/stringmatcher?stringA=YOURFIRSTSTRING&stringB=YOURSECONDSTRING

Your example:
https://pyautomata.azurewebsites.net/api/stringmatcher?stringA=Charlie Woods&stringB=Woods Charlie
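Names usually contain spaces and sometimes other special characters, so it is safer to URL-encode the two query parameters before sending the request. The sketch below only builds the URL and does not call the service; the function name `build_matcher_url` is made up for this example, and encoding spaces as `%20` is an assumption about what the endpoint accepts.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://pyautomata.azurewebsites.net/api/stringmatcher"


def build_matcher_url(string_a: str, string_b: str) -> str:
    """Return the request URL with both names percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    query = urlencode({"stringA": string_a, "stringB": string_b}, quote_via=quote)
    return f"{BASE_URL}?{query}"
```

The resulting string can be used as the endpoint of the HTTP Request activity.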

The output json will be:

{ "Input stringA": "Charlie Woods",
  "Input stringB": "Woods Charlie",
  "Ratio": {
    "Description": "Calculates Levenshtein distance similarity ratio of two input strings.",
    "Value": 54
  },
  "Partial_Ratio": {
    "Description": "Performs a substring matching by matching using the shortest string and recursively matching with all substrings.",
    "Value": 70
  },
  "Token_Sort_Ratio": {
    "Description": "Sorts the words in the input strings alphabetically and then calculates the Levenshtein distance similarity ratio of the two modified input strings.",
    "Value": 100
  },
  "Token_Set_Ratio": {
    "Description": "Considers common tokens in the two input strings and calculates the Levenshtein distance similarity ratio of the two modified input strings. Recommended to be used when the difference in length between two input strings is significant.",
    "Value": 100
  },
  "Documentation": {
    "Blog": "https://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/",
    "GitHub": "https://github.com/seatgeek/fuzzywuzzy"
  }
}
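To apply the 80% rule from the original question, the response can be parsed and one of the scores compared against a threshold. This is a minimal sketch using a shortened copy of the response above; the function name `is_match` and the choice of `Token_Sort_Ratio` as the default metric are assumptions, so pick the metric that suits your data.

```python
import json

# Shortened copy of the response shown above (descriptions omitted).
SAMPLE_RESPONSE = """
{
  "Input stringA": "Charlie Woods",
  "Input stringB": "Woods Charlie",
  "Ratio": {"Value": 54},
  "Partial_Ratio": {"Value": 70},
  "Token_Sort_Ratio": {"Value": 100},
  "Token_Set_Ratio": {"Value": 100}
}
"""


def is_match(response_text: str, metric: str = "Token_Sort_Ratio",
             threshold: int = 80) -> bool:
    """Return True when the chosen score reaches the threshold."""
    payload = json.loads(response_text)
    return payload[metric]["Value"] >= threshold
```

For the sample above, the token-sorted score passes the 80% threshold while the plain `Ratio` does not, which shows why word order matters when comparing names.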

Hope this clears things a bit.

kumar.varun2 · July 13, 2021, 11:13am

@jeevith

Very informative

kumar.varun2 · July 13, 2021, 11:57am

Hi @KevinDS

I have designed a simple workaround. I am attaching the workflow. Let me know if it works for you.

Compare Names.xaml (9.1 KB)

Regards
Varun Kumar

wjoubert · August 4, 2021, 2:28pm

Thank you, this can come in very handy. I’ll just build on it a bit.

amithvs · February 22, 2023, 1:10pm

Hi, can you confirm whether this API service is working for you now? I tested it with an online API tester and it did not work for me. Do we have to pass any parameters or auth methods?

amithvs · February 22, 2023, 1:16pm

Hi Kevin, May I know which method you used to achieve this?

jeevith · February 22, 2023, 1:33pm

Hi @amithvs,

I have been working on my observations for the question you asked here: Invoke Python Method: Pipe is broken - #19 by amithvs

I have a workaround to achieve what you want, but it is a workaround without using Python Scope or Invoke PowerShell. I am testing out my solution. More on it in that thread soon.

As for the API link above, I did not know the API was down. It looks like a random Azure Functions bug has made my function unstable. Read more here: Random "The service is unavailable." and "Azure Functions runtime is unreachable" errors · Issue #8583 · Azure/azure-functions-host · GitHub

I can post an update here when the function is up and running again. Please use it for development purposes only: I have set a monthly budget on the app service, and the API is stopped once that budget is used up.

amithvs · February 22, 2023, 1:53pm

Thanks, Jeevith for the update. I have updated a few of my findings in the other thread.

jeevith · April 17, 2023, 3:28pm

I have the function updated now on Azure Functions (HTTP Trigger)

And also have the same logic running on AWS lambda

Feel free to use them for development purposes.
