I have used a Read PDF Text activity to get information from an invoice. I need to extract the text “Professional Services” under (Item description), as shown in the message box. How can I do that? Please help.
Hi @Amr_Nweery,
Could you try the expression below? Assuming the extracted PDF text is in a variable PdfText, you could use the following and check:
System.Text.RegularExpressions.Regex.Match(PdfText,"Total[\S\s]+\d+\s{2,}(.*)\s{2,}").Groups(1).Value.Trim
If it works, make sure you test it against several other invoices as well.
If it does not work, could you share the extracted text so that we can check the match on our end?
First, split the string on "Total" and "Subtotal:"
str = str.Split({"Total","Subtotal:"},StringSplitOptions.None)(1).Trim
- this gives you all the rows in the table you need
Now split on new lines to get each row as an array:
arrstr = str.Split({Environment.NewLine, vbLf}, StringSplitOptions.RemoveEmptyEntries)
Now run this regex on each row:
System.Text.RegularExpressions.Regex.Match(eachrow,"(?<=\d+\s+)[A-Za-z]+ *[A-Za-z]*(?=\s+)").Value
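The three steps above (split out the table, split into rows, run the regex per row) can be sketched as a small console program. This is only a sketch: the invoice text, the column layout and the names pdfText, tableText and rows are made up for illustration, and a real PDF read with preserved formatting may be laid out differently.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module RegexPerRowSketch
    Sub Main()
        ' Hypothetical invoice text; the real layout depends on the PDF.
        Dim pdfText As String = String.Join(vbLf, New String() {
            "Invoice No: 1001",
            "Qty  Item description        Unit price  Total",
            "1    Professional Services   100.00      100.00",
            "Subtotal: 100.00"})

        ' Step 1: keep only the text between "Total" and "Subtotal:".
        Dim tableText As String = pdfText.Split(New String() {"Total", "Subtotal:"}, StringSplitOptions.None)(1).Trim()

        ' Step 2: one array element per table row.
        Dim rows As String() = tableText.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.RemoveEmptyEntries)

        ' Step 3: take the one or two words that follow the leading number.
        For Each row As String In rows
            Dim description As String = Regex.Match(row, "(?<=\d+\s+)[A-Za-z]+ *[A-Za-z]*(?=\s+)").Value
            Console.WriteLine(description) ' prints "Professional Services" for the sample row
        Next
    End Sub
End Module
```

Note that this pattern only captures up to two words, so a longer description would be cut short.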
cheers
Is there any way to do it with data manipulation, using Substring and Split functions?
Then follow the steps I gave you, but replace the last regular expression with the line below. Make sure to read the data with preserve formatting enabled.
reqstr = eachrow.Split({"  "}, StringSplitOptions.RemoveEmptyEntries)(1)
Note that inside the Split I am splitting on a double space, not a single one.
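To show why the double space matters, here is a minimal sketch on one made-up row. The row text and the variable names are assumptions. It relies on the columns being separated by at least two spaces while the words inside the description are separated by only one.

```vb
Imports System

Module SplitOnlySketch
    Sub Main()
        ' Hypothetical row as it might look with preserved formatting.
        Dim eachrow As String = "1    Professional Services   100.00      100.00"

        ' Splitting on two spaces keeps "Professional Services" together,
        ' because only a single space separates the two words.
        Dim columns As String() = eachrow.Split(New String() {"  "}, StringSplitOptions.RemoveEmptyEntries)

        ' Index 0 is the quantity, index 1 the item description.
        Dim reqstr As String = columns(1).Trim()
        Console.WriteLine(reqstr) ' prints "Professional Services" for this row
    End Sub
End Module
```

Splitting on a single space instead would return only "Professional" at index 1. The extra Trim guards against a leftover space when a gap has an odd number of spaces.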
Hope this helps
Cheers