How to Extract the Text between Two Matches?

Hi guys,
As marked in the screenshot, I need to extract the whole data from the PDF files.

Note: This is a sample file. The original has more than 67 pages, with other data at the top and bottom.

Please suggest a solution using regex or some other method.
Regex.txt (4.6 KB)

Regards
Gokul

Hi @Gokul001

Are we extracting the data from a PDF?

If yes, then Document Understanding will be a reliable approach!

Regards

Hi @pravin_calvin
Can you suggest an approach apart from DU?

(?s)start_Word(.*?)last_Word

You can use this, replacing start_Word and last_Word with your own markers. For the last one, I think you can use a date-format regex. Test it on regex101.com.
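A minimal sketch of how that pattern behaves, written in Python only for illustration (UiPath itself uses the .NET regex engine, where the (?s) flag likewise lets "." match newlines). The sample text is made up, and start_Word / last_Word are just the placeholders from the pattern above, not values from the actual PDF.

```python
import re

# Hypothetical text standing in for the text read from the PDF.
sample_text = """Header line
start_Word
row one
row two
last_Word
Footer line"""

# (?s) lets "." match newlines; (.*?) is non-greedy, so each match
# stops at the first last_Word that follows a start_Word.
pattern = re.compile(r"(?s)start_Word(.*?)last_Word")


def extract_between(text):
    """Return every block found between the two markers, trimmed."""
    return [block.strip() for block in pattern.findall(text)]


blocks = extract_between(sample_text)
print(blocks)
```

Because the capture is non-greedy, a 67-page file with many start/end pairs should yield one block per pair rather than one huge match.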

Hi Gokul,

Did you try the Generate Data Table activity with some delimiters (space and newline) to extract the text data into a datatable? With regex it will be too complex to extract each and every value from the whole file. Please try it and let us know. Thanks.
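To illustrate the idea only (this is not the Generate Data Table activity itself, just a rough Python sketch of the same delimiter logic): a newline separates rows and whitespace separates columns. The function name and sample input are hypothetical.

```python
def text_to_rows(text):
    """Split raw text into rows (by newline) and cells (by whitespace)."""
    rows = []
    for line in text.splitlines():
        cells = line.split()  # splits on any run of spaces or tabs
        if cells:             # skip blank lines
            rows.append(cells)
    return rows


# Hypothetical extracted text, not taken from the real PDF.
rows = text_to_rows("a b  c\n\nd e")
print(rows)
```

Note that values containing spaces would be split into several cells with this approach, so it only fits data whose columns never contain a space.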

I will check and update you, @kirankumar.mahanthi1

Hi @Vijay_RPA

I think this will solve my query. I will check it.

Hi Gokul,

Sorry, earlier I didn't fully understand your requirement. If you want the bunch of data between dates and you don't need specific values from the content, you could go with regex and ignore my suggestion. Thanks.
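A rough sketch of the "data between dates" case, in Python for illustration. It assumes the dates look like dd/mm/yyyy, which is only a guess since the real format is in the screenshot; adjust the date pattern to match your file.

```python
import re

# Assumed date format (dd/mm/yyyy); change this to match the real PDF text.
DATE = r"\d{2}/\d{2}/\d{4}"

# Capture a date, then everything up to (but not including) the next date
# or the end of the text. (?s) lets "." match across newlines.
pattern = re.compile(rf"(?s)({DATE})(.*?)(?={DATE}|\Z)")


def split_by_dates(text):
    """Return (date, block) pairs, one per dated section."""
    return [(date, body.strip()) for date, body in pattern.findall(text)]


# Hypothetical sample text.
sections = split_by_dates("01/02/2023 alpha\nbeta\n03/04/2023 gamma\n")
print(sections)
```

The lookahead keeps the next date out of the current block, so each date starts its own section.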

Please mark it as solved if it works :)
