Good afternoon, I am building an automation in which I need to extract the text of a scanned PDF, search that text for a specific word, and show whatever follows that word. Can someone help me, please?
For example, given the text “Please read our Forum FAQ - Beginner’s Guide before creating a new post. If you want to report a bug”, I need everything that comes after “post” and everything that comes after “report”.
1. Read PDF with OCR (output variable: scannedText)
2. Assign postIndex = scannedText.IndexOf("post") + "post".Length
3. Assign reportIndex = scannedText.IndexOf("report") + "report".Length
4. Assign textAfterPost = scannedText.Substring(postIndex)
5. Assign textAfterReport = scannedText.Substring(reportIndex)
6. Log Message: "Text after ‘post’: " + textAfterPost
7. Log Message: "Text after ‘report’: " + textAfterReport
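The steps above can also be sketched outside UiPath. This is a minimal Python illustration, assuming the OCR text is already available as a string; `text_after` is a hypothetical helper name. The check for a missing keyword is an addition that is not in the steps above: IndexOf returns -1 when the word is not found, so without a check the Substring would start at the wrong position.

```python
from typing import Optional


def text_after(text: str, keyword: str) -> Optional[str]:
    """Return everything after the first occurrence of keyword, or None if absent."""
    index = text.find(keyword)  # -1 when the keyword is not in the text
    if index == -1:
        return None
    # Skip past the keyword itself, like IndexOf(...) + keyword.Length
    return text[index + len(keyword):]


# Sample text taken from the question
scanned_text = (
    "Please read our Forum FAQ - Beginner's Guide before creating a new post. "
    "If you want to report a bug"
)

print("Text after 'post':", text_after(scanned_text, "post"))
print("Text after 'report':", text_after(scanned_text, "report"))
```

In UiPath the same guard could be an If activity that checks whether IndexOf returned -1 before the Substring is taken.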
Once you have read the PDF with the Read PDF Text or Read PDF With OCR activity and know the specific word you want to get the text after, you can try the regex (?<=Word_Var).* where Word_Var stands for that word. It returns everything after Word_Var on the same line.
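As a rough illustration of that lookbehind pattern, here is a small Python sketch. It assumes the word is held in a variable, so it has to be concatenated into the pattern (a literal Word_Var inside the pattern string would be matched as text). `after_word_on_line` is a hypothetical helper name, and the sample text is the question's sentence with a line break added to show that the match stops at the end of the line.

```python
import re
from typing import Optional


def after_word_on_line(text: str, word: str) -> Optional[str]:
    """Return the rest of the line after the first occurrence of word, or None."""
    # re.escape keeps characters such as '.' or '(' in the word from acting as regex syntax
    pattern = "(?<=" + re.escape(word) + ").*"
    match = re.search(pattern, text)  # '.' does not cross line breaks by default
    return match.group(0) if match else None


sample = "before creating a new post. If you want\nto report a bug"

print(after_word_on_line(sample, "post"))
print(after_word_on_line(sample, "report"))
```

If the text after the word continues onto the following lines and is also needed, the IndexOf and Substring approach may be the simpler choice.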
Use the Read PDF With OCR activity and store the result in a String variable.
Then use regex or string manipulation to extract the particular data from that string.
This will let you extract the text you need from the string.
@Quenton_Wayne_Rebello, can you please share how you set this up in the designer? I built it as you described, but I get a Boolean result.
You are using the Is Text Matching activity, which gives a Boolean output.
Use the Find Matching Patterns activity instead.
You are using Regex.IsMatch, which only checks whether the regex matches anywhere in the string. Instead, you can use System.Text.RegularExpressions.Regex.Match(String,Regex).Value.ToString, where String is the input text and Regex is the pattern.
This expression can be used directly in an Assign activity: the left side is the variable you want to save the result in, and the right side is System.Text.RegularExpressions.Regex.Match(String,Regex).Value.ToString.
If you would rather use activities, use Find Matching Patterns. Its result is a collection of matches, and the first match gives the string result.
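The difference between the Boolean check and the extracted value can be illustrated with a short Python sketch. The .NET calls named above are not used here; `re.search` and `re.finditer` are stand-ins, and the sample text is a shortened version of the sentence from the question.

```python
import re

text = "creating a new post. If you want to report a bug"
pattern = r"(?<=post).*"

# Boolean check only: comparable to Regex.IsMatch / Is Text Matching
is_match = re.search(pattern, text) is not None

# First match as a string: comparable to Regex.Match(...).Value,
# which is an empty string in .NET when nothing matches
first = re.search(pattern, text)
value = first.group(0) if first else ""

# All matches as a collection: comparable to the Find Matching Patterns result
all_values = [m.group(0) for m in re.finditer(pattern, text)]

print(is_match)
print(value)
print(all_values)
```

The first line printed is only True or False, which is the kind of result described in the question; the second is the text itself.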