Hello All,
I need your support with a PDF automation. Below is the scenario.
Extract a repeated word from the PDF and count how many times the word is repeated.
Hi @Kabeer
Add UiPath.PDF.Activities to your project
Declare the five variables:
MyWord (a String for the word you’re looking for)
Pattern (a String for the pattern used in the regex for matching your word)
DocumentText (a String for the text extracted from the PDF)
WordMatches (a MatchCollection for the regex result)
WordCount (an Integer for your answer)
Use the “Read PDF Text” activity to extract the text from the PDF into DocumentText
Use an “Assign” activity to set Pattern to a regex that will match your word as a whole word:
String.Format("\b{0}\b", MyWord)
Use an “Assign” activity to set WordMatches to the regex matches of the word in the text:
System.Text.RegularExpressions.Regex.Matches(DocumentText, Pattern)
Use an “Assign” activity to set WordCount to the number of matches:
WordMatches.Count
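For anyone who wants to try the logic outside UiPath first, here is a minimal sketch of the same steps in Python. It assumes the PDF text has already been extracted into a string; the function name count_word and the sample text are made up for illustration, and the re.escape call is an extra safeguard that is not part of the steps above.

```python
import re

def count_word(document_text: str, my_word: str) -> int:
    """Count whole-word occurrences of my_word in document_text."""
    # Same idea as String.Format("\b{0}\b", MyWord): wrap the word in word boundaries.
    # re.escape is an addition here so that characters like "." in the word are taken literally.
    pattern = r"\b{0}\b".format(re.escape(my_word))
    # Equivalent of Regex.Matches(DocumentText, Pattern)
    word_matches = re.findall(pattern, document_text)
    # Equivalent of WordMatches.Count
    return len(word_matches)

# Hypothetical sample text standing in for DocumentText
sample_text = "UiPath reads the PDF. UiPath counts words; UiPathStudio is not a match."
print(count_word(sample_text, "UiPath"))
```

Because of the \b word boundaries, "UiPathStudio" is not counted as an occurrence of "UiPath". The match is case-sensitive by default, as it is with the .NET expression above.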
Hi @msan
Read the PDF text with OCR, pass the resulting String variable to the Matches activity, and use the pattern (?<=your word).*
Thanks
Ashwin.S
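One thing worth checking with a lookbehind pattern like this: it captures the text that follows the word rather than the word itself, so the number of matches may not equal the number of occurrences. A small sketch in Python, used here only for illustration (the sample string is made up, and the .NET regex engine in UiPath can differ in details):

```python
import re

# Hypothetical single-line sample with three occurrences of the word
text = "uipath is here, uipath is there, uipath everywhere"

# Lookbehind pattern: matches whatever follows the word, up to the end of the line
lookbehind_matches = re.findall(r"(?<=uipath).*", text)

# Word-boundary pattern: matches each occurrence of the word itself
word_matches = re.findall(r"\buipath\b", text)

print(len(lookbehind_matches), len(word_matches))
```

Here the lookbehind pattern yields a single match (everything after the first occurrence), while the word-boundary pattern yields one match per occurrence, which is what a count needs.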
@Kabeer
refer
to have more insight on Regex.
What word do you need to count? You can use the Regex Match activity. I can help if you tell me what the word is.
Hello msan,
Thanks for the instructions. If you have a sample project for this scenario, please share it with me so I can get more clarity.
Hi,
For example: the PDF has 4 pages. We need to find the given word on all the pages and count its occurrences.
Input word: uipath (needs to be checked on all the pages).
Output: The word repeated 4 times
- Use the PDF Read activity to read the PDF file and put the result in a String variable
- Use the Regex Matches activity on that String to find every occurrence of the word
- Use RegexResult.Count to get the count
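For the 4-page example, the steps above could be sketched like this in Python. It assumes the text of each page is already available as a string (the pages list is made-up sample data, and count_word_per_page is a hypothetical helper). Ignoring case is also an assumption, made because the input word is given in lowercase.

```python
import re

def count_word_per_page(pages, word):
    """Return the number of whole-word, case-insensitive matches on each page."""
    # Assumption: "uipath", "UiPath" and "UIPATH" should all count as the same word.
    pattern = re.compile(r"\b{0}\b".format(re.escape(word)), re.IGNORECASE)
    return [len(pattern.findall(page_text)) for page_text in pages]

# Hypothetical text of a 4-page PDF, one string per page
pages = [
    "UiPath automates PDF tasks.",
    "This page mentions uipath once.",
    "Nothing relevant here except UIPATH.",
    "Final page: UiPath.",
]

per_page = count_word_per_page(pages, "uipath")
total = sum(per_page)
print("The word repeated {0} times".format(total))
```

If the whole document is read into one string, as in the steps above, a single count over that string gives the same total; the per-page list is only useful when you also want to know where the word appears.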
Thanks a lot msan