Hello Community
The use case: an internal customer has over 500 PDFs that need to be searched.
Robot path: capture all text from PDF and assign to string: TextOutput
There is also an Excel column of keywords; each keyword is checked for occurrences in the document.
Robot path: for each row in dt_keywords, if TextOutput.Contains(CurrentRow(KeywordColumnName).ToString)
A column is created for each PDF that is read, named after the PDF's title.
Robot path: PDFColumnName = PDFName
Underneath each PDFColumnName, occurrences are marked with ‘X’
Robot path: Then Assign CurrentRow(PDFColumnName) = "X"
Issue: false positives (seen while testing only three documents).
For example, the list of keywords contains F-1, F-2, F-15, and F-21.
(The real list has over 2,000 variations and subvariations of terms.)
F-15 and F-21 are in the document and are marked correctly.
However, F-1 and F-2 are also marked, even though they do not appear in the document on their own. Presumably they match because F-1 is contained in F-15 and F-2 is contained in F-21.
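To illustrate, here is a minimal Python sketch of what I believe the robot is effectively doing. The real workflow is in UiPath, and the sample text and the names `sample_text` and `substring_marks` are made up for illustration. A plain substring check marks F-1 and F-2 because they are contained in the longer terms:

```python
# Hypothetical sample text standing in for TextOutput from one PDF.
sample_text = "Units F-15 and F-21 were inspected."
keywords = ["F-1", "F-2", "F-15", "F-21"]

# Substring check: true whenever the keyword appears anywhere in the text,
# even when it is only the start of a longer term such as F-15.
substring_marks = {kw: ("X" if kw in sample_text else "") for kw in keywords}

for kw, mark in substring_marks.items():
    print(f"{kw:5} {mark}")
```

All four keywords end up marked here, which matches the false positives I see.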
Thoughts: create another loop and assign each match to a string. If the length of that string equals the length of the keyword in the current row, mark it with ‘X’. It might even be possible to count occurrences this way, but I'm stuck on the specifics.
Or maybe I'm thinking about it the wrong way (very likely).
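A rough sketch of that idea in Python, as a way to pin down the logic rather than a finished solution. It assumes a keyword only counts when it is not directly preceded or followed by a letter or digit. The helper name `count_whole_term` and the sample text are hypothetical. I assume the same pattern could be built in the robot with `Regex.Escape` and `Regex.Matches(...).Count` from System.Text.RegularExpressions, but I have not tried that.

```python
import re


def count_whole_term(text: str, keyword: str) -> int:
    """Count occurrences of keyword that are not part of a longer alphanumeric term."""
    # re.escape keeps characters such as "-" or "." in the keyword literal.
    # The lookarounds reject a match that touches a letter or digit on either side.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(keyword) + r"(?![A-Za-z0-9])"
    return len(re.findall(pattern, text))


# Hypothetical sample text standing in for TextOutput from one PDF.
text_output = "Units F-15 and F-21 were inspected. F-15 passed."
keywords = ["F-1", "F-2", "F-15", "F-21"]

# One count per keyword; a count above zero becomes an "X" in the PDF's column.
counts = {kw: count_whole_term(text_output, kw) for kw in keywords}
marks = {kw: ("X" if n > 0 else "") for kw, n in counts.items()}

for kw in keywords:
    print(f"{kw:5} {counts[kw]} {marks[kw]}")
```

With this sample, F-15 is counted twice, F-21 once, and F-1 and F-2 are left unmarked. If a term like F-1-A should not count as F-1, the hyphen would have to be added to the excluded characters in both lookarounds.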
Any suggestions?
Thank you