Hello guys,
I am currently working on finding the exact text entered by the user in a PDF and getting its page number. I can extract the whole PDF using the Read PDF activity, but I don't have any idea how to implement the next step (finding the text in the PDF). I just want to get the page number of that text. I am a newbie, so if you could share your thoughts or a solution here, it would be helpful to me.
Hi
Take a look here. You can use Regex to extract, confirm if text exists etc.
Here is a quick video tutorial.
Hopefully this helps.
Cheers
Steve
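To make the regex idea concrete, here is a minimal sketch in Python rather than a UiPath workflow. It assumes the text of each page is already available as a separate string (for example, by reading the PDF one page at a time); the `find_pages` helper and the sample page texts are made up for illustration.

```python
import re

def find_pages(pages, phrase):
    """Return 1-based page numbers whose text contains `phrase` as whole words."""
    # re.escape keeps characters such as "." or "(" in the user's input literal.
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    return [number for number, text in enumerate(pages, start=1)
            if pattern.search(text)]

# Hypothetical sample: one string per page.
pages = [
    "Introduction to computers",
    "Networks and the Internet",
    "Distributed Data Processing (DDP) over the Internet",
]
print(find_pages(pages, "internet"))  # [2, 3]
```

The `\b` word boundaries stop a short input such as "net" from matching inside "Internet". Whether the search should ignore case is a design choice; drop `re.IGNORECASE` if the match must be exact.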
Thanks Steven, this video really helped me implement the next step.
Hi @Vaibhav_Rajpoot_17, now I have a doubt related to regex.
Here is the situation:
I am going to get three input words from the user, and using regex I want to check whether all three words appear on the same PDF page.
Example:
word 1: sun
word 2: moon
word 3: star
I need to check that these words are all on the same page using regex.
Is it possible to check all three words?
Note: I need to check whether the three words are on page 1; if not, I will check the next page.
Hi @Vaibhav_Rajpoot_17,
For this, you can split the PDF document page-wise and verify, one page at a time, whether the keywords are mentioned in each split document.
Thanks
Yes, you are right @Kalees9486. I am already using a While loop and checking the pages one by one, but the problem is the regex. I am using the pattern ("\b"+get_ip+"|"+get_ip2+"|"+get_ip3+"\b") in the Is Match activity, but it gives me the page number if any one of the inputs matches on that page, so I am confused.
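The behaviour described here follows from how `|` works: it means "or", so the concatenated pattern succeeds as soon as any one word is found. A small Python sketch of the difference, using the sun/moon/star example from above and a made-up page text:

```python
import re

words = ["sun", "moon", "star"]
page_text = "The moon rose late."  # hypothetical page containing only one of the words

# Same shape as the concatenated pattern: \bsun|moon|star\b
any_pattern = r"\b" + "|".join(words) + r"\b"
matches_any = re.search(any_pattern, page_text) is not None

# One whole-word check per input; the page qualifies only if every check passes.
matches_all = all(re.search(r"\b" + re.escape(w) + r"\b", page_text) for w in words)

print(matches_any, matches_all)  # True False
```

Note also that `|` has the lowest precedence, so in the concatenated pattern the leading `\b` applies only to the first word and the trailing `\b` only to the last one. The middle alternative has no boundaries at all and would match inside a longer word such as "moonlight".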
An Is Match activity verifies and returns the status for one pattern. You have to use three separate Is Match activities, passing each one its own pattern and the input text.
Thanks
So it will be like:
ismatch1=("\b"+get_ip+"\b")
ismatch2=("\b"+get_ip2+"\b")
ismatch3=("\b"+get_ip3+"\b")
Right, @Kalees9486?
Then how should I write the If condition?
I am currently using ismatch1.Equals(true) as the condition.
If I change to three Is Match activities, how should I change the If condition to check that all three results are true?
You can use an If activity to check that all three Is Match outputs are true using the And operator, like below:
If
IsMatchOut1 And IsMatchOut2 And IsMatchOut3
Then (all three keywords are present)
Log message
Else (at least one keyword is missing)
Log message
Because the Is Match activity returns a Boolean, you can pass those variables directly to an If activity. If all three variables are true, the Then branch executes, meaning all three keywords are present on a single page; otherwise, at least one keyword is missing from that page.
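The same logic can be sketched outside UiPath as a short Python loop over pages. This is only an illustration: `is_match` plays the role of one Is Match activity, and `first_page_with_all` and the sample pages are hypothetical.

```python
import re

def is_match(text, word):
    """Whole-word, case-sensitive check, playing the role of one Is Match activity."""
    return re.search(r"\b" + re.escape(word) + r"\b", text) is not None

def first_page_with_all(pages, word1, word2, word3):
    """Return the 1-based number of the first page containing all three words, or None."""
    for number, text in enumerate(pages, start=1):
        out1 = is_match(text, word1)
        out2 = is_match(text, word2)
        out3 = is_match(text, word3)
        if out1 and out2 and out3:  # Booleans are combined directly
            return number
    return None

pages = ["sun and moon", "moon and star", "star, sun, moon"]
print(first_page_with_all(pages, "sun", "moon", "star"))  # 3
```

Each of the three results is computed from the same page text before they are combined, which is the property the If condition relies on.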
First extract the text, then analyze the pattern of your string, then use regex to extract it.
First you have to find a pattern that matches these specific words.
If possible, share the PDF file.
I did it as per your instructions!
But it is satisfying only two of the conditions.
This is the If condition I gave: boo1.Equals(true) and boo2.Equals(true) and boo3.Equals(true)
But the condition ignored the first input!
Anyway, thanks for that great idea @Kalees9486.
@Vaibhav_Rajpoot_17 yes, I did it, but my patterns are like this:
ismatch1=("\b"+get_ip+"\b")
ismatch2=("\b"+get_ip2+"\b")
ismatch3=("\b"+get_ip3+"\b")
Now the If condition is checked, but it ignores the ismatch1 Boolean value,
i.e. the If branch executes for (false and true and true)!
But what I need is for it to execute only if (true and true and true).Equals(true)...
Hi @Veera_Raj,
In that case,
how do I analyze the pattern to match three words?
Computer Programming.pdf (204.2 KB)
I have attached an example PDF.
In that PDF, I will check for the following words:
1) Distributed Data Processing
2) Internet
3) DDP
which are on page 7.
Expression -
(regex.IsMatch(TExt_PDF,"(Distributed Data Processing)") and regex.IsMatch(TExt_PDF,"(Internet)")) AndAlso regex.IsMatch(TExt_PDF,"(DDP)")
Here is the workflow; I hope it helps you:
Sequence.xaml (9.9 KB)
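As a side note on the earlier question of whether one regex can check all three words: one known option is to chain lookaheads, one per phrase. Below is a minimal Python sketch of that alternative (it is not what the attached workflow does). The `all_words_pattern` helper and the sample page texts are invented for illustration, and lookahead syntax is also supported by .NET regular expressions.

```python
import re

def all_words_pattern(words):
    """Build one pattern that matches only if every phrase occurs somewhere in the text."""
    # Each (?=.*phrase) looks ahead from the start of the text without consuming it.
    lookaheads = "".join("(?=.*" + re.escape(w) + ")" for w in words)
    # DOTALL lets "." cross line breaks, which extracted page text usually contains.
    return re.compile(lookaheads, re.DOTALL)

pattern = all_words_pattern(["Distributed Data Processing", "Internet", "DDP"])

# Hypothetical page texts, not taken from the attached PDF.
page_with_all = "Distributed Data Processing (DDP)\nis common on the Internet."
page_with_one = "The Internet connects computers."

print(pattern.match(page_with_all) is not None)  # True
print(pattern.match(page_with_one) is not None)  # False
```

Three separate checks joined with AndAlso are easier to read and debug; the single pattern is mainly useful when only one pattern field is available. Either way, a phrase that the PDF extraction breaks across two lines will not match unless the pattern allows for the line break.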
Thanks @Vaibhav_Rajpoot_17, this expression really worked to check whether all three words are present! Also, thanks for sharing the workflow.
If your problem is resolved, please mark it as the solution.