Hi,
How do I extract a sentence that starts with one keyword and ends with another from a PDF using UiPath?
For example, this is the text of my PDF:
“h1bv2jb32fio2h3gi23. 3ebwgubwregoerg. gwegwe. Today is a good day to exercise and play some games with my friend. hbqwoiuehfweg. gergerger.gergerg”
The starting keyword is “Today” and the ending keyword is “friend”.
How can I make use of these keywords to extract the sentence out?
Currently I am using this to get the text that follows the starting keyword:
Keyword = "Today"
All_Text.Substring(All_Text.IndexOf(Keyword)+Keyword.Length)
You can try this regex:
txt = "h1bv2jb32fio2h3gi23. 3ebwgubwregoerg. gwegwe. Today is a good day to exercise and play some games with my friend. hbqwoiuehfweg. gergerger.gergerg"
Regex.Match(txt, "(?=Today)([\S\s]*)(?<=friend)").Value
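One thing to watch with the pattern above: `[\S\s]*` is greedy, so if "friend" appears more than once, the match runs to the last occurrence. Below is a minimal sketch of a lazy variant, written in Python only because it is easy to run outside UiPath; the helper name `extract_between` is made up for illustration. The resulting pattern (`Today[\s\S]*?friend`) should behave the same way in .NET's `Regex.Match`.

```python
import re

def extract_between(text, start, end):
    """Return the shortest span that begins with `start` and ends with `end`, or None."""
    # re.escape guards against keywords that contain regex metacharacters.
    # [\s\S]*? matches any characters (including line breaks) lazily,
    # so the match stops at the first `end` after `start`.
    pattern = re.escape(start) + r"[\s\S]*?" + re.escape(end)
    match = re.search(pattern, text)
    return match.group(0) if match else None

txt = ("h1bv2jb32fio2h3gi23. 3ebwgubwregoerg. gwegwe. "
       "Today is a good day to exercise and play some games with my friend. "
       "hbqwoiuehfweg. gergerger.gergerg")

sentence = extract_between(txt, "Today", "friend")
print(sentence)
```

If a keyword is missing from the text, the helper returns None instead of raising an error, which is worth checking before using the result.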
Thank you very much. This worked.
One more question: is there a way to extract the property address, which includes address line 1, address line 2 and (sometimes) address line 3? The lines don’t have a fixed position in the PDF.
address line 1 = 12 abc street
address line 2 = #01-01
address line 3 = abcd
1st scenario:
iwbegregeriug
property address
12 abc street
#01-01
abcd
postal code
2nd scenario:
regkuerhgoier
postal code
12 abc street
property address
#01-01
abcd
Sometimes there is no address line 2, and sometimes there is no address line 3.
Thank you very much.
Hi @anonymous3,
address = "iwbegregeriug
property address
12 abc street
#01-01
abcd
postal code"
Regex.Match(address, "(?<=property address)([\S\s]*)").Value
This extracts everything after "property address", so the result starts with a line break and still includes the trailing "postal code" line.
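For the case where the address lines move around, here is a rough sketch of a line-based approach, again in Python just so it can be run on its own. It rests on assumptions that are not confirmed in this thread: the only labels are "property address" and "postal code", everything before the first label is unrelated text, and every non-label line after that point belongs to the address. The names `LABELS` and `extract_address_lines` are made up for illustration, and a real PDF will likely need the rules adjusted.

```python
# Assumed label lines, taken from the two scenarios in the question.
LABELS = {"property address", "postal code"}

def extract_address_lines(text):
    """Return the non-label lines that appear from the first label onward."""
    # Drop blank lines and surrounding whitespace.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Locate the first label; text before it is treated as unrelated.
    first_label = next(
        (i for i, line in enumerate(lines) if line.lower() in LABELS), None
    )
    if first_label is None:
        return []
    # Keep everything after that point that is not itself a label.
    return [line for line in lines[first_label:] if line.lower() not in LABELS]

scenario_1 = """iwbegregeriug
property address
12 abc street
#01-01
abcd
postal code"""

scenario_2 = """regkuerhgoier
postal code
12 abc street
property address
#01-01
abcd"""

print(extract_address_lines(scenario_1))
print(extract_address_lines(scenario_2))
```

Because the result is a list, a missing address line 2 or 3 simply gives a shorter list, so the count can be checked before assigning the values.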