False Positive for String Variables

Hello Community

The use case: Internal customer has PDFs (over 500!) that need to be searched.
Robot path: capture all text from PDF and assign to string: TextOutput

Also, there is an Excel column of keywords to search for occurrence in the document.
Robot path: for each row in dt_keywords, if TextOutput.Equals(CurrentRow(KeywordColumnName))

A column is created with the title of each PDF read
Robot path: PDFColumnName = PDFName

Underneath each PDFColumnName, occurrences are marked with ‘X’
Robot path: Then Assign CurrentRow(PDFColumnName) = “X”

Issue: false positives (seen during a test of only three documents).
For example, the list of keywords contains F-1, F-2, F-15, and F-21.
(The real list has over 2000 variations and subvariations of terms.)
F-15 and F-21 are in the document and are marked correctly.
However, F-1 and F-2 are not in the document and are marked anyway (false positives).

Thoughts: Create another loop and assign each match to a string. If that string’s length matches the length of the keyword in the current row, mark with ‘X’. It might even be possible to count occurrences this way. But I’m stuck on the specifics.
Or maybe I’m thinking about it wrong (very likely).

Any suggestions?
Thank you

That’s because “F-21” contains “F-2”. A simple way to solve this is not to look for “F-2” but to look for " F-2 " (assuming there is always a space before and after the values you’re looking for). If you can’t assume that, then you’ll have to use RegEx.
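To illustrate the point above, here is a minimal sketch in Python (not UiPath/VB.NET) of why a plain substring check flags “F-2” and how a regex with lookarounds avoids it. The function name contains_whole, the sample text, and the choice to treat letters, digits, underscores, and hyphens as "part of a term" are all assumptions for illustration; the real rule depends on how the keywords appear in the PDFs.

```python
import re

# Hypothetical sample text standing in for TextOutput.
text = "Aircraft listed: F-15 and F-21."

# A plain substring check finds "F-2" inside "F-21" (the false positive).
substring_hit = "F-2" in text

def contains_whole(keyword: str, text: str) -> bool:
    """Return True only if keyword appears as a standalone term.

    The keyword is escaped so characters like '-' or '.' are taken literally,
    and the lookarounds reject a match that is directly preceded or followed
    by a word character or a hyphen (assumed definition of "same term").
    """
    pattern = r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])"
    return re.search(pattern, text) is not None

for kw in ["F-1", "F-2", "F-15", "F-21"]:
    print(kw, contains_whole(kw, text))
```

Unlike the “space before and after” approach, this also handles a keyword at the very start or end of the text or next to punctuation. In .NET, Regex.Escape and Regex.IsMatch should allow the same pattern to be built per keyword, though that has not been tested here.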

Thank you @postwick
The list of keywords has nearly 2000 rows,
so neither putting a space before and after each keyword (I tried this before posting)
nor regex looks like a workable solution here.
Appreciate your advice

This doesn’t make any sense. You’re supposed to loop through the list of keywords and use the “space before and after” or RegEx to check if each one exists in the data.
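The loop described above might look like the following sketch, written in Python rather than as a UiPath workflow. The pattern is built from each keyword inside the loop, so the size of the list (2000 rows or more) does not require writing any pattern by hand. The names count_keywords, text_output, and marks are hypothetical, and the lookaround rule (no adjacent letter, digit, underscore, or hyphen) is an assumption about what counts as a separate term.

```python
import re

def count_keywords(keywords, text):
    """Count standalone occurrences of each keyword in text."""
    counts = {}
    for kw in keywords:
        # One pattern per keyword, generated automatically from the list.
        pattern = r"(?<![\w-])" + re.escape(kw) + r"(?![\w-])"
        counts[kw] = len(re.findall(pattern, text))
    return counts

# Hypothetical stand-ins for the Excel keyword column and the PDF text.
keywords = ["F-1", "F-2", "F-15", "F-21"]
text_output = "The F-15 flew with the F-21; a second F-15 followed."

counts = count_keywords(keywords, text_output)

# Mark "X" only where the keyword really occurs, as in the PDF column.
marks = {kw: ("X" if n > 0 else "") for kw, n in counts.items()}
print(counts)
print(marks)
```

The same loop also gives the occurrence counts mentioned in the original question, since the number of matches is available for every keyword.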

Hi,

Can you share a specific sample? Dummy data is fine.

Regards,