Address Match Percentage

Hello, I’m wondering if anybody can help me with the issue below. I have addresses in two different Excel files. If the address “100 Maple drive, suite 1600, CA 54105” is given in File1, File2 could list the same address in different forms, such as:

  1. “100 Maple dr., suite 1600, CA 54105”
  2. “100 Maple drive, ste 1600, CA 54105”
  3. “100 Maple dr, ste 1600, CA 54105”
  4. “100 Maple drive, s 1600, CA 54105”

I want to compare the File1 address against each address in File2 and get a match percentage for each one.

Thank you,
Chetan

How do you define “percentage” specifically here?

Hi Anthony,
I want to compare the address in File1 against the addresses in File2 and get a match percentage for each. Basically, word-by-word matching between the two addresses.
Hope that answers your question.

Is it like this:

If “100 Maple drive, suite 1600, CA 54105” is the comparing address then:

  1. There is an 86% match, since "drive," <> "dr.,".
  2. There is an 86% match, since "suite" <> "ste".
  3. There is a 71% match, since "drive," <> "dr," and "suite" <> "ste".
  4. There is an 86% match, since "suite" <> "s".
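
For reference, the word-by-word comparison behind these figures can be sketched in Python (not UiPath). This is only an illustration: the name `word_match_percent` is made up, and it assumes words are split on whitespace, compared by position, case-sensitively, with the File1 address as the base and the result rounded to the nearest whole percent.

```python
def word_match_percent(base: str, other: str) -> int:
    """Percentage of words in `base` that equal the word at the same position in `other`."""
    base_words = base.split()
    other_words = other.split()
    if not base_words:
        return 0  # assumption: an empty base address matches nothing
    # zip stops at the shorter list, so unpaired words never count as matches
    matches = sum(b == o for b, o in zip(base_words, other_words))
    return round(100 * matches / len(base_words))


base = "100 Maple drive, suite 1600, CA 54105"
candidates = [
    "100 Maple dr., suite 1600, CA 54105",
    "100 Maple drive, ste 1600, CA 54105",
    "100 Maple dr, ste 1600, CA 54105",
    "100 Maple drive, s 1600, CA 54105",
]
for candidate in candidates:
    print(candidate, word_match_percent(base, candidate))
```

The base has 7 words, so one mismatch gives 6/7 and two mismatches give 5/7.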

In essence, this makes sense, but what about these cases?

  5. “Maple drive, suite 1600, CA 54105”
    What is the match here? If we compare word by word, then there is a 0% match, since "100" <> "Maple", "Maple" <> "drive", etc.

  6. “100 Maple drive, suite 1600, CA 54105 8362”
    If we use our comparison string and compare as much as possible, we have a 100% match. But if we base our comparison on the longer string, we’ll have an 88% match. Which one should be the base?

  7. “100 Maple drive Apt 25, suite 1600, CA 54105”
    There are 2 problems here. The first is that having “Apt 25” in the string throws off the comparison. The second concerns which string should be the base for calculating the percentage.

You are right, Mr. Anthony, about the percentages you have derived.
Regarding the base, it’s always going to be the address mentioned in File1.

Are there activities available for the comparison?

The first four options that you mentioned are correct. I need this calculation in UiPath.

Not that I’m aware of, but strings can be compared. But I still need to understand what you would do if cases 5-7 appear.

In these cases as well, I will select the address that gives me the highest percentage.

In case 5, regardless of which address you pick, you’ll get a 0% match. I want to confirm that this makes sense.

In case 7, we’ll get a 43% match with the original string, and a 33% match if we use the compared string (the longer one). It’s only because the original string is shorter that it gives us a better percentage. Is this the desired result?
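
To make the two figures concrete, here is a small Python sketch (illustrative only; `tokens` and `match_both_bases` are made-up names). It assumes punctuation is ignored, since as far as I can tell 43% only works out if "drive," and "drive" count as the same word. It also assumes neither address is empty.

```python
import string


def tokens(address: str) -> list[str]:
    # Assumption: leading/trailing punctuation is stripped from each word
    return [word.strip(string.punctuation) for word in address.split()]


def match_both_bases(original: str, compared: str) -> tuple[int, int]:
    """Return (percent based on original, percent based on compared)."""
    orig_words, comp_words = tokens(original), tokens(compared)
    # Positional comparison: only words at the same index can match
    matches = sum(o == c for o, c in zip(orig_words, comp_words))
    return (
        round(100 * matches / len(orig_words)),
        round(100 * matches / len(comp_words)),
    )


original = "100 Maple drive, suite 1600, CA 54105"
case_7 = "100 Maple drive Apt 25, suite 1600, CA 54105"
print(match_both_bases(original, case_7))  # (43, 33)
```

Only "100", "Maple" and "drive" line up, so the result is 3 of 7 words against the original and 3 of 9 against the longer string.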

Yes, Anthony.

This will process the addresses as specified. The workflow takes an input of a compare string, and an array of strings to which it should be compared. The output is an array of percentages for each comparison in the array of strings.
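
The attached workflow itself is not reproduced in this thread. As a rough Python illustration of the interface described above (one compare string and an array of strings in, one percentage per string out), something like the following would do. The names are hypothetical, and the matching rule (whitespace split, positional comparison, File1 address as the base) is an assumption based on the discussion, not taken from the xaml.

```python
def match_percentages(compare: str, candidates: list[str]) -> list[int]:
    """Return one whole-number match percentage per candidate, in input order."""
    base_words = compare.split()
    if not base_words:
        return [0 for _ in candidates]  # assumption: empty base matches nothing
    percentages = []
    for candidate in candidates:
        # Count words that agree at the same position in both addresses
        matches = sum(b == c for b, c in zip(base_words, candidate.split()))
        percentages.append(round(100 * matches / len(base_words)))
    return percentages


file1_address = "100 Maple drive, suite 1600, CA 54105"
file2_addresses = [
    "100 Maple dr., suite 1600, CA 54105",
    "100 Maple drive, ste 1600, CA 54105",
    "100 Maple dr, ste 1600, CA 54105",
    "100 Maple drive, s 1600, CA 54105",
]
percentages = match_percentages(file1_address, file2_addresses)
# Pick the candidate with the highest percentage; ties resolve to the first one
best_index = max(range(len(percentages)), key=percentages.__getitem__)
print(percentages)
print(file2_addresses[best_index])
```

Selecting the highest percentage follows what was agreed earlier in the thread; how ties should be broken was not discussed, so taking the first is just a default.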

Processusvierge.zip (2.9 KB)

Thank you so much for providing the xaml file. Currently I’m testing the scenarios and will let you know.

Thank you once again.

Thank you so much Anthony for this solution. It really helped a lot.
