Hi all, I am using the Regex Based Extractor to extract data from a PDF. I am extracting the “Amount” field, and it works, but the value is extracted twice because the “Amount” field appears twice in the PDF. Could you please suggest how to avoid extracting it twice?
Thanks!
Hi @Shivi
Modify the regular expression used in the Regex Based Extractor. If the current expression matches both values, tighten it so that it matches only one of them, for example by anchoring it to text that appears next to only one of the two amounts.
Hope it helps!!
I have set it to extract only once, however it’s still extracting the value twice.
Can you share the input data? Then I’ll try to write the regular expression… @Shivi
Okay @Shivi
Can you try the regular expression below in the Regex Based Extractor:
(?<=Invoice Number\s*Amount Due \(INR\)\s*Rs\.)[\d\.\,]+
Hope it helps!!
I tried it, but it’s not working.
Can you try the below one… @Shivi
(?<=Invoice Number\s*Amount Due \(INR\)\s*.*\s*Rs.)[\d\.\,]+
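For anyone who wants to test this idea outside the extractor, here is a minimal Python sketch. It rests on some assumptions: the sample text is hypothetical (the real PDF text was not shared in this thread), and because Python's `re` module does not accept variable-length lookbehind, the sketch uses a capture group in place of the lookbehind. It takes only the first match, so a second occurrence of the amount is ignored.

```python
import re

# Hypothetical sample text; the real PDF content was not shared in this thread.
sample = (
    "Invoice Number    Amount Due (INR)\n"
    "INV-001           Rs.1,250.00\n"
    "Total\n"
    "Amount Due (INR): Rs.1,250.00\n"
)

# Capture-group form of the pattern above: match the header row,
# skip whatever precedes "Rs." on the next line, and capture the digits.
pattern = re.compile(
    r"Invoice Number\s*Amount Due \(INR\)\s*.*?\s*Rs\.([\d.,]+)"
)

# search() stops at the first match, so the amount is returned once
# even if it appears again later in the text.
match = pattern.search(sample)
first_amount = match.group(1) if match else None
print(first_amount)
```

If the header row and the amount are laid out differently in the actual document, the part between `Amount Due \(INR\)` and `Rs\.` would need adjusting.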
Hi @Shivi,
First, extract the text between the two occurrences of “Amount Due (INR):”, then use string manipulation or another regex to get the value you need.
Regex: (?<=Amount Due \(INR\):)([\S\s]*)(?=Amount Due \(INR\):)
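The two-step approach can be sketched in Python as follows. This is only an illustration under assumptions: the sample text and the `Rs.` amount format are hypothetical, and the parentheses in `(INR)` are escaped so that they match literally instead of forming a capture group.

```python
import re

# Hypothetical sample text with the label appearing twice.
text = (
    "Amount Due (INR): Rs.1,250.00\n"
    "Payment terms: 30 days\n"
    "Amount Due (INR): Rs.1,250.00\n"
)

# Step 1: grab everything between the first label and the next one.
# The lazy quantifier stops at the nearest following label.
between = re.search(
    r"(?<=Amount Due \(INR\):)([\s\S]*?)(?=Amount Due \(INR\):)",
    text,
)
segment = between.group(1) if between else ""

# Step 2: pull the amount out of that segment only.
amount_match = re.search(r"Rs\.\s*([\d,]+(?:\.\d+)?)", segment)
amount = amount_match.group(1) if amount_match else None
print(amount)
```

Because the second step only looks inside the captured segment, the amount after the second label is never considered.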
Hi @Shivi,
Could you show us the output in which the value is extracted twice?


