Regular expression to handle dynamic strings

Lalitha_Selvaraj · June 21, 2024, 12:10pm

Hi,
I’m extracting text from multiple PDF files, and a single character in the extracted text keeps changing. Is there a regular expression to handle this?
E.g. text file 1 - string appears as “Insurance policy amount” (lowercase L)
text file 2 - string appears as “Insurance poIicy amount” (capital I)
text file 3 - string appears as “Insurance policv amount” (v instead of y).
I’m retrieving the details from the text file and saving them in Excel. If I want to retrieve “Insurance policy amount”, how do I handle this?

vrdabberu · June 21, 2024, 12:16pm

Hi @Lalitha_Selvaraj

Can you provide a sample text file?

Regards

Lalitha_Selvaraj · June 21, 2024, 12:42pm

Hi @vrdabberu . Please find the sample text file.
insurance.txt (129 Bytes)

vrdabberu · June 21, 2024, 12:51pm

What should be extracted from these 3 strings:

Insurance PoIicv Amount: $1,200,000.00
Insurance PoIicy Amount: J25 1 000 i00_
Insurance Policy Amount: ##5##Q00##0 __

String1 output?
String2 output?
String3 output?

Regards

Lalitha_Selvaraj · June 21, 2024, 1:01pm

I have a field in the PDF, “Insurance amount”, and I need to extract its value.

When the PDF text is extracted into a text file, the field name (Insurance amount) changes as mentioned earlier. Is there a regular expression that handles all variants of the field name?
The expected result is the value next to Insurance policy amount.

vrdabberu · June 21, 2024, 1:14pm

Hi @Lalitha_Selvaraj

Please try the regex below:

(?<=\.\s*[A-Za-z ]+\:\s*)(.*)

Regards

indra · June 21, 2024, 1:19pm

@Lalitha_Selvaraj

You can use this regex to get policy amount

Lalitha_Selvaraj · June 21, 2024, 2:21pm

It’s working, but I have another field named “Amount”, so it’s picking up those values too.

Lalitha_Selvaraj · June 21, 2024, 2:22pm

It’s not working, @vrdabberu. Can you give some other regex if possible?

Lalitha_Selvaraj · June 21, 2024, 2:25pm

The words “Insurance” and “Amount” are constant. Only the word “policy” keeps changing.

vrdabberu · June 21, 2024, 2:26pm

Hi @Lalitha_Selvaraj

Is the above expected Output?

Or is the below one the expected output:

Regards

Lalitha_Selvaraj · June 21, 2024, 2:31pm

vrdabberu · June 21, 2024, 2:31pm

Hi @Lalitha_Selvaraj

Try it on the RegExr website:
Pattern: (?<=\.\s*)[A-Za-z ]+(?=\:\s*)

Regards

Lalitha_Selvaraj · June 21, 2024, 2:37pm

It’s working, but this is very general. Is it possible to add the “Insurance” and “Amount” keywords to the regex to make it more specific?

indra · June 21, 2024, 2:37pm

@Lalitha_Selvaraj Please share all input strings and the expected output.

vrdabberu · June 21, 2024, 2:40pm

Hi @Lalitha_Selvaraj

Try this then:

Insurance[A-Za-z ]+Amount

Regards
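
As a rough illustration of how this pattern behaves, here is a minimal Python sketch run against the label variants quoted earlier in the thread. The helper name `is_insurance_amount_label` is hypothetical, and Python is used only for illustration; the thread itself does not say which engine the pattern runs in.

```python
import re

# "Insurance", then one or more letters/spaces (the OCR-damaged middle word), then "Amount".
LABEL = re.compile(r"Insurance[A-Za-z ]+Amount")

def is_insurance_amount_label(text):
    """Return True if text contains 'Insurance', some letters/spaces, then 'Amount'."""
    return LABEL.search(text) is not None

labels = [
    "Insurance PoIicv Amount",
    "Insurance PoIicy Amount",
    "Insurance Policy Amount",
    "Amount",  # the other field mentioned in the thread
]
for label in labels:
    print(label, "->", is_insurance_amount_label(label))
```

Note that the class `[A-Za-z ]` only tolerates letter-for-letter substitutions. If the extraction ever produced a digit or symbol inside the word (for example `po1icy`), this pattern would not match and the class would need widening.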

Lalitha_Selvaraj · June 21, 2024, 2:46pm

Sorry for the confusion. I need the value on the right side (the highlighted ones), using a regex that keeps “Insurance” and “Amount” constant.

vrdabberu · June 21, 2024, 2:51pm

Hi @Lalitha_Selvaraj

(?<=\.\s*Insurance[A-Za-z ]+Amount\:\s*)(.*)

Regards
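
For readers who want to try the same idea outside a lookbehind-capable engine, here is a minimal Python sketch. It is an adaptation, not the pattern above verbatim: Python's built-in `re` module only accepts fixed-width lookbehind, so the value is taken from a capturing group instead. The leading `\.\s*` is also dropped, on the assumption that the label may appear at the start of a line. The function name `extract_insurance_amounts` and the `Amount: $500.00` line are hypothetical.

```python
import re

# Label with a variable middle word, then a colon; the value is captured in group 1.
VALUE = re.compile(r"Insurance[A-Za-z ]+Amount\s*:\s*(.*)")

def extract_insurance_amounts(text):
    """Return the text following each 'Insurance ... Amount:' label, one entry per match."""
    return [m.group(1).strip() for m in VALUE.finditer(text)]

# Two lines quoted in the thread plus a made-up plain "Amount" field.
sample = """Insurance PoIicv Amount: $1,200,000.00
Amount: $500.00
Insurance Policy Amount: ##5##Q00##0 __"""

print(extract_insurance_amounts(sample))
```

Because the label must begin with `Insurance`, the plain `Amount` line is skipped. The captured value is returned as extracted, so garbled amounts such as `##5##Q00##0 __` still need separate cleanup.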

Lalitha_Selvaraj · June 21, 2024, 3:50pm

Great. It’s working, @vrdabberu. Thanks.

vrdabberu · June 21, 2024, 3:53pm

You’re welcome @Lalitha_Selvaraj

Happy Automation !!
