Regex Based Extractor extracts values from all pages of the PDF

I am using Document Understanding’s Regex Based Extractor to extract “total” from multiple PDFs. All the PDFs contain multiple copies of the same page, which I cannot remove. The trouble is that the extractor returns every instance of “total” from all the pages, so a 3-page PDF yields 3 duplicates of “total”. I only want the value from the first page, not from the other pages. Any help is appreciated!

Hi @shrey.shah

Try the expression below:

System.Text.RegularExpressions.Regex.Match("Inputstring","Your Pattern").Groups(1).ToString

Regex.Match returns only the first match in the input, and Groups(1) returns the text captured by the first capturing group of that match.
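To illustrate the difference between taking the first match and collecting every match, here is a minimal sketch in Python (used only because it is easy to run; the expression in this thread is VB.NET). `re.search` plays the role of .NET's `Regex.Match`, and `re.findall` the role of `Regex.Matches`. The sample text and the pattern are made up for the example and are not taken from the poster's PDFs.

```python
import re

# Hypothetical extracted text: the same page repeated three times.
PAGE = "Invoice 001\nTotal: Rs. 4500\n"
document_text = PAGE * 3

# Hypothetical pattern with one capturing group around the amount.
pattern = r"Total:\s*Rs\.\s*(\d+)"

# Collecting every match gives one hit per copy of the page.
all_totals = re.findall(pattern, document_text)

# Searching for the first match stops after the first hit;
# group(1) is the text inside the first pair of parentheses.
first = re.search(pattern, document_text)
first_total = first.group(1) if first else None

print(all_totals)
print(first_total)
```

Whether such an expression can be used inside the Regex Based Extractor itself is a separate question; the sketch only shows what "first match, first group" means.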

Regards
Gokul

Thank you @Gokul001 for your time. I am using the Regex Based Extractor from the Document Understanding package, so where do I insert the regex you mentioned? Sorry, but I am new to UiPath.

Hi @shrey.shah

Have a look at the thread

Have a look at the document

Regards
Gokul

OK, I understand where to enter the expression (in the advanced field, right?). The field I want to extract is the total amount, which follows the pattern “Rs. xxxxxx”. So in the expression you provided, “Your Pattern” would be replaced with “Rs.”, and what would “Inputstring” be replaced with?
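A note on the pattern itself: in a regex, an unescaped dot matches any character, so “Rs.” on its own would also match text such as “Rsx”. The Python sketch below shows the difference and one possible pattern for an amount after “Rs.”. The amount format (digits with optional commas and up to two decimals), the name `AMOUNT_PATTERN`, and the helper `extract_amount` are assumptions made for illustration.

```python
import re

# Unescaped dot: "." matches any character except a newline.
loose = re.search(r"Rs.", "Rsx 100")
loose_text = loose.group(0) if loose else None

# Escaped dot, optional whitespace, then a capturing group for the amount.
# Assumed format: digits, optional commas, optional decimal part.
AMOUNT_PATTERN = r"Rs\.\s*(\d[\d,]*(?:\.\d{1,2})?)"

def extract_amount(text):
    """Return the first amount that follows a literal 'Rs.', or None."""
    m = re.search(AMOUNT_PATTERN, text)
    return m.group(1) if m else None

print(loose_text)
print(extract_amount("Total Rs. 12,500.00"))
```

The same escaping rule (`\.` for a literal dot) applies to .NET regular expressions.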

Hi @shrey.shah

I don’t quite follow. Can you share a screenshot?

Regards
Gokul

Hi @Gokul001
This is the field I want to extract!

Screenshot 2022-03-09 104213

This is the regex builder where I am typing the regex you mentioned!

Hi @shrey.shah

In the Value field, just enter the regex pattern.

Regards
Gokul

Hi @Gokul001
That is what I was doing previously as shown below:

But this returns the value from all the pages of the PDF. I only want the value from the first page!

If possible, share the input, @shrey.shah

@Gokul001 By input, do you mean the PDF files?

Yes @shrey.shah

Hi @shrey.shah

Have you tried selecting the option in the drop-down?

Check → SingleLine

Regards
Gokul

@Gokul001 Yes it is working now. So Singleline basically extracts only the value from the first page?

Great @shrey.shah

You can use Singleline only for this particular regex pattern.

If your query is resolved, kindly close this topic by marking the solution, so it will help others too.
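For context on what the option does: in .NET, `RegexOptions.Singleline` changes the dot so that it also matches newline characters. It does not by itself restrict matching to one page. The Python sketch below uses `re.DOTALL`, the equivalent flag, on made-up text. How the Regex Based Extractor handles pages internally is not confirmed in this thread, so treat this as one plausible explanation of the behaviour, not a description of the activity.

```python
import re

# Hypothetical text of a three-page PDF, each page a copy of the first.
PAGE = "Invoice 001\nTotal Rs. 4500\nThank you\n"
text = PAGE * 3

greedy = r"Rs\..*"

# Default: "." stops at a newline, so there is one match per page copy.
per_line = re.findall(greedy, text)

# DOTALL: "." also matches newlines, so the greedy ".*" runs to the end
# of the text and the whole remainder becomes a single match.
across_lines = re.findall(greedy, text, flags=re.DOTALL)

# A bounded quantifier keeps each match short, so every copy matches again.
bounded = re.findall(r"Rs\..{0,5}", text, flags=re.DOTALL)

print(len(per_line), len(across_lines), len(bounded))
```

Under these assumptions, a greedy pattern with the flag set returns a single result because that one match covers everything after the first “Rs.”, including the other pages.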

Regards
Gokul

@Gokul001 Thanks a lot!

Great @shrey.shah

Happy Automation

Regards
Gokul

@Gokul001 Sorry to disturb again but if I select the Singleline option, then along with the amount it is also extracting other details in the page as shown below:

I tried limiting the characters but then it again extracts the value from all pages even with Singleline selected:

Hi @shrey.shah

In this case, use string manipulation to extract the particular amount.

Can you share the data returned by the regex extractor?
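As a rough idea of what such string manipulation could look like, here is a minimal Python sketch that takes an over-captured string and keeps only the first amount after “Rs.”. The function name `first_amount`, the sample text, and the assumed amount format (digits, commas, and a decimal point) are hypothetical. In a UiPath workflow the same steps would be written with .NET string methods such as `IndexOf` and `Substring`.

```python
from typing import Optional

def first_amount(extracted: str, marker: str = "Rs.") -> Optional[str]:
    """Return the run of digits, commas and dots after the first marker."""
    start = extracted.find(marker)
    if start == -1:
        return None  # marker not present in the extracted text
    # Skip the marker and any whitespace that follows it.
    rest = extracted[start + len(marker):].lstrip()
    end = 0
    while end < len(rest) and (rest[end].isdigit() or rest[end] in ",."):
        end += 1
    # Drop trailing punctuation such as a sentence-ending dot.
    amount = rest[:end].rstrip(".,")
    return amount or None

# Hypothetical over-captured value: the amount plus other page details.
sample = "Total Rs. 12,500.00\nInvoice No 42\nTotal Rs. 12,500.00"
print(first_amount(sample))
```

Because only the first occurrence of the marker is used, repeated copies of the page later in the string do not affect the result.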

Regards
Gokul

@Gokul001 I uploaded images of the extracted data for both scenarios (Singleline with no character limit, and Singleline with a character limit) in my previous reply.