Extract specific info from text

I used a Read PDF Text activity to get the text of an invoice. I need to extract the value “Professional Services” under (Item description), as shown in the message box. How can I do that? Please help.

Hi @Amr_Nweery ,

Could you try the expression below?

Assuming the text extracted from the PDF is in a variable named PdfText, you could use the following and check the result:

System.Text.RegularExpressions.Regex.Match(PdfText,"Total[\S\s]+\d+\s{2,}(.*)\s{2,}").Groups(1).Value.Trim

If it works, make sure you test it against several other invoices.

If it does not work, could you share the extracted text so that we can check the match on our end?

@Amr_Nweery

First split the string on "Total" and "Subtotal:"

str = str.Split({"Total","Subtotal:"},StringSplitOptions.None)(1).Trim - this gives you all the rows in the table you need

Now split on new lines to get each row as an array:

arrstr = str.Split({Environment.NewLine, vbLf},StringSplitOptions.RemoveEmptyEntries)

Now run this regex on each row:

System.Text.RegularExpressions.Regex.Match(eachrow,"(?<=\d+\s+)[A-Za-z]+ *[A-Za-z]*(?=\s+)").Value
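As a rough illustration of the steps above, here is a small Python sketch (Python is used only so the logic can be tried outside UiPath). The invoice text, its column layout and the names `pdf_text`, `rows_between` and `item_description` are assumptions, not data from this thread. Python's `re` module does not accept a variable-length lookbehind such as `(?<=\d+\s+)`, so the sketch swaps it for a capture group.

```python
import re

# Hypothetical invoice text; the real output of Read PDF Text may look different.
pdf_text = (
    "Invoice No: 1001\n"
    "Item    Item description          Qty    Total\n"
    "1       Professional Services     2      500.00\n"
    "Subtotal: 500.00\n"
)


def rows_between(text, start="Total", end="Subtotal:"):
    """Return the non-empty lines between the first `start` and the next `end`."""
    block = text.split(start, 1)[1].split(end, 1)[0]
    return [line.strip() for line in block.splitlines() if line.strip()]


def item_description(row):
    """Return the one or two words that follow the leading number, or '' if none."""
    match = re.search(r"\d+\s+([A-Za-z]+ *[A-Za-z]*)(?=\s)", row)
    return match.group(1).strip() if match else ""


for row in rows_between(pdf_text):
    print(item_description(row))
```

With this made-up layout the loop prints the description of each table row. A different column order would need a different pattern.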

cheers

Is there any way to do it with data manipulation, using the Substring and Split functions?

@Amr_Nweery

Then follow the steps I gave you, but replace the last regular expression with the line below. Make sure to read the data with Preserve Formatting enabled.

reqstr = eachrow.Split({"  "},StringSplitOptions.RemoveEmptyEntries)(1).Trim

Inside the Split I am splitting on a double space, not a single one.
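To see why the double space matters, here is a minimal Python sketch of the same idea (Python is used only for illustration). The sample row and the helper name `split_columns` are assumptions. It presumes the columns are separated by at least two spaces, while the words inside a description are separated by one.

```python
# Hypothetical row, read with formatting preserved so columns are 2+ spaces apart.
row = "1       Professional Services     2      500.00"


def split_columns(line, sep="  "):
    """Split on a double space and drop the blank pieces left by longer runs of spaces."""
    return [part.strip() for part in line.split(sep) if part.strip()]


columns = split_columns(row)
print(columns[1])  # the second column, i.e. the item description
```

Splitting on a single space instead would break "Professional Services" into two separate pieces, which is why the separator has to be two spaces.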

Hope this helps

Cheers