How to extract complex data using regex?

08/11/2020          MY HOME                    Mortgage Company                                                           TRU
                       MORTGAGE
   07/29/2020          AXYPN/TJX DC                   Miscellaneous                                                              TRU
   06/09/2020          PRBCS/CARD                    Credit Card                                                                TRU
   06/09/2020          TEST CARD                    Bank Credit Card                                                           XPN

What would be the regex to get the following data? I want the text that comes after each date, but it should not include "Mortgage Company", "Credit Card", etc. These values are not fixed. Should I consider extracting everything after the date? I find it hard to get this data. Any ideas would be a great help.

I also have to account for the spaces. Thank you.

Expected output:

MY HOME MORTGAGE
AXYPN/TJX DC
PRBCS/CARD
TEST CARD

Hi,

Hope the following helps you.

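We can probably achieve this with the regex (?<=^\s*\d{2}/\d{2}/\d{4}\s+)\S.*?(?=\s{2}) and the Multiline option. The lookbehind anchors on the date at the start of a line, and the match stops at the first run of two or more whitespace characters.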
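For anyone trying this outside .NET: the pattern above relies on a variable-length lookbehind, which engines such as .NET accept but Python's built-in re module rejects. Here is a minimal Python sketch of the same idea, assuming a capture group in place of the lookbehind and spaces or tabs as column padding. The names SAMPLE and first_column are only illustrative, and the padding in SAMPLE is shortened.

```python
import re

# Illustrative sample modelled on the question (column padding shortened).
SAMPLE = (
    "08/11/2020          MY HOME                    Mortgage Company        TRU\n"
    "                       MORTGAGE\n"
    "   07/29/2020          AXYPN/TJX DC                   Miscellaneous        TRU\n"
    "   06/09/2020          PRBCS/CARD                    Credit Card          TRU\n"
    "   06/09/2020          TEST CARD                    Bank Credit Card      XPN\n"
)

# Date at the start of a line, padding, then the shortest run of text
# that is followed by at least two spaces or tabs.
FIRST_COLUMN = re.compile(
    r"^[ \t]*\d{2}/\d{2}/\d{4}[ \t]+(\S.*?)(?=[ \t]{2})",
    re.MULTILINE,
)


def first_column(text):
    """Return the text that follows the date on every dated line."""
    return [m.group(1) for m in FIRST_COLUMN.finditer(text)]


print(first_column(SAMPLE))
# ['MY HOME', 'AXYPN/TJX DC', 'PRBCS/CARD', 'TEST CARD']
```

Note that this only reads the line that carries the date, so text wrapped onto the following line (MORTGAGE in the first record) is not picked up.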

Sample20200828-1.zip (2.3 KB)

Regards,

I managed to extract the data from the PDF. The sample above is the output I logged using WriteLine, and it includes the spaces.

I can't open the file you sent.

I've tried your regex, but it doesn't produce the expected output for this example. It should take the spaces into account.

08/11/2020          TEST HOME                    Mortgage Company                                                           TRU
                       MORTGAGE

The output should be: TEST HOME MORTGAGE

because the word MORTGAGE is on the next line.

Hi,

I think it is difficult to achieve this with a single regex, so I wrote some logic instead, as follows. Can you try this?
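The attached workflow is not reproduced in this thread, so the following is only a rough Python sketch of one way such logic could work, not the attachment's actual contents. It assumes that columns are separated by at least two spaces, and that a wrapped line belongs to the first column when its text starts to the left of where the second column began on the dated line. The names RECORD, CHUNK and extract_names are hypothetical.

```python
import re

# A dated line: date, padding, first column, then either the padding
# before the second column or the end of the line.
RECORD = re.compile(
    r"^[ \t]*\d{2}/\d{2}/\d{4}[ \t]+(\S.*?)(?:[ \t]{2,}(?=\S)|[ \t]*$)"
)
# First run of words separated by single spaces on an undated line.
CHUNK = re.compile(r"\S+(?: \S+)*")


def extract_names(text):
    """Collect the first column, joining text wrapped onto later lines."""
    names = []
    next_col = None  # offset where the second column started on the dated line
    for line in text.splitlines():
        record = RECORD.match(line)
        if record:
            names.append(record.group(1))
            next_col = record.end()
            continue
        chunk = CHUNK.search(line)
        # An undated line counts as a continuation of the first column
        # only if its text starts before the second column's offset.
        if chunk and next_col is not None and chunk.start() < next_col:
            names[-1] += " " + chunk.group(0)
    return names


sample = (
    "08/11/2020          TEST HOME                    Mortgage Company      TRU\n"
    "                       MORTGAGE\n"
    "   06/09/2020          TEST CARD                    Bank Credit Card   XPN\n"
)
print(extract_names(sample))
# ['TEST HOME MORTGAGE', 'TEST CARD']
```

The offset check is what keeps a wrapped second-column value (for example a long company type) from being glued onto the name. If the PDF extraction does not preserve column alignment, that check would need a different rule.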

Sample20200828-1v2.zip (2.7 KB)

Regards,
