I tried that earlier. The output data actually comes from a PDF (which can be 500 words long), and the address is only one part of that output.
This algorithm gives a low match percentage.
The actual problem is to check whether that address is present in the PDF or not. (I cannot use a string.contains method, since the addresses only match fuzzily.)
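
To make the problem concrete, here is a minimal sketch of one possible approach, not necessarily the right one for this data: slide a window of roughly the address's length over the PDF text and keep the best fuzzy score, instead of scoring the address against the whole 500-word text at once. It uses only `difflib` from the Python standard library. The function names, the `slack` parameter, and the `0.85` threshold are all assumptions made for illustration, and the threshold would need tuning on real addresses.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    # Lowercase and keep only alphanumeric tokens, so punctuation and
    # spacing differences do not affect the comparison.
    return re.findall(r"[a-z0-9]+", text.lower())


def best_window_score(address, document, slack=2):
    # Hypothetical helper: best similarity between the address and any
    # run of consecutive words in the document.
    addr_tokens = normalize(address)
    doc_tokens = normalize(document)
    if not addr_tokens or not doc_tokens:
        return 0.0
    target = " ".join(addr_tokens)
    n = len(addr_tokens)
    best = 0.0
    # Try windows slightly shorter and longer than the address, since
    # tokens may be split or merged (e.g. "12B" vs "12-B").
    for size in range(max(1, n - slack), n + slack + 1):
        last_start = max(1, len(doc_tokens) - size + 1)
        for start in range(last_start):
            window = " ".join(doc_tokens[start:start + size])
            score = SequenceMatcher(None, target, window).ratio()
            best = max(best, score)
    return best


def address_present(address, document, threshold=0.85):
    # The threshold is an assumed value, not a recommended one.
    return best_window_score(address, document) >= threshold
```

Because each window is about as long as the address, the surrounding PDF text no longer drags the score down. For example, `address_present("12B Baker Street London NW1 6XE", pdf_text)` can succeed even when the PDF contains "12-B, Baker Stret, London NW1 6XE" in the middle of a longer paragraph.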