Extract a specific info from text

Amr_Nweery · March 27, 2023, 6:13pm

I used the Read PDF Text activity to get the text of an invoice. I need to extract the phrase “Professional Services” under (Item description), as shown in the message box. How can I do that? Please help.

supermanPunch · March 27, 2023, 6:21pm

Hi @Amr_Nweery ,

Could you try the expression below?

Assuming the extracted PDF text is in a variable named PdfText, use the following and check the result:

System.Text.RegularExpressions.Regex.Match(PdfText,"Total[\S\s]+\d+\s{2,}(.*)\s{2,}").Groups(1).Value.Trim

If it works, make sure you test it against several other invoices.

If it does not work, could you share the extracted text so that we can check the match on our end?

Anil_G · March 27, 2023, 6:24pm

@Amr_Nweery

First, split the string on “Total” and “Subtotal:”

str = str.Split({"Total","Subtotal:"},StringSplitOptions.None)(1).Trim - this gives you all the rows of the table you need

Now split on new lines to get each row as an array…

arrstr = str.Split({Environment.NewLine, vbLf},StringSplitOptions.RemoveEmptyEntries)

Now run this regex on each row

System.Text.RegularExpressions.Regex.Match(eachrow,"(?<=\d+\s+)[A-Za-z]+ *[A-Za-z]*(?=\s+)").Value
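
For anyone who wants to sanity-check these steps outside Studio, here is a rough Python sketch of the same idea. The sample text is hypothetical (the real invoice layout is not shown in this thread), and the names SAMPLE_TEXT, ROW_PATTERN and item_descriptions are made up for illustration. Python's re module does not accept a variable-width lookbehind, so the sketch captures the description in a group instead of using the .NET lookbehind.

```python
import re

# Hypothetical invoice text; the real layout from the PDF may differ.
SAMPLE_TEXT = (
    "Invoice No: 1001\n"
    "Item   Item description        Qty   Unit price   Total\n"
    "1      Professional Services   2     100.00       200.00\n"
    "2      Travel Expenses         1     50.00        50.00\n"
    "Subtotal:   250.00\n"
)

# A number, whitespace, then one or two words followed by whitespace.
# The description is captured in a group because Python's re module
# does not allow variable-width lookbehind.
ROW_PATTERN = re.compile(r"\d+\s+([A-Za-z]+ *[A-Za-z]*)\s")


def item_descriptions(text):
    """Return the description found in each table row of the text."""
    # Step 1: keep what sits between the "Total" header and "Subtotal:".
    table = re.split(r"Total|Subtotal:", text)[1].strip()
    # Step 2: one entry per non-empty line.
    rows = [line for line in table.splitlines() if line.strip()]
    # Step 3: apply the row pattern and collect the captured text.
    matches = (ROW_PATTERN.search(row) for row in rows)
    return [m.group(1).strip() for m in matches if m]


print(item_descriptions(SAMPLE_TEXT))
```

The pattern only covers descriptions of one or two words, so longer descriptions would be cut short, and a text without the “Total” header would fail at the first split.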

cheers

Amr_Nweery · March 27, 2023, 7:54pm

Is there any way to do it with data manipulation, using the Substring and Split functions?

Anil_G · March 27, 2023, 8:35pm

@Amr_Nweery

Then follow the steps I gave you, but instead of the last regular expression use the expression below. Make sure to read the data with Preserve Formatting enabled.

reqstr = eachrow.Split({"  "},StringSplitOptions.RemoveEmptyEntries)(1).Trim

Inside the split I am splitting on a double space, not a single one.
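
The double-space split can be tried on a single row with a small Python sketch like the one below. The row is a hypothetical example of what the text might look like when read with formatting preserved, and description_from_row is an illustrative name, not something from UiPath.

```python
def description_from_row(row):
    """Return the second column of a row whose columns are 2+ spaces apart."""
    # Split on a double space and drop the empty pieces that longer gaps
    # produce, similar to StringSplitOptions.RemoveEmptyEntries.
    columns = [piece for piece in row.split("  ") if piece]
    # Index 1 is the column after the item number; strip the stray space
    # left over when a gap has an odd number of spaces.
    return columns[1].strip()


# Hypothetical row, as it might look when read with formatting preserved.
row = "1      Professional Services   2     100.00       200.00"
print(description_from_row(row))
```

This only works while the columns are separated by at least two spaces and the description itself contains single spaces only, which is why the formatting has to be preserved when reading the PDF.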

Hope this helps

Cheers
