I have data in three different formats and need a single regex that matches all of them. So far I have written a separate regex for each format.
Data Format 1 : 01/11/2023 05/03/2021 2 10000000000000000000001 10000000000000000000001 90000 ON/OFF 1 0 $10.22
VISIT NEW
Patient ID / 1000 Kh Wil DOB: 09/27/2023
Name:
Review Message
free tuof
Format 1 regex : (?<service_from>\d+/\d+/\d{4}) (?<service_to>\d+/\d+/\d{4}) \d* (?\d{24}).* (?<est_overpayment>.\d+.\d+).\s.\s.Patient ID . (?<patient_id>\d+) (?[\D\s,]+) DOB: (?\d+/\d+/\d{4})\sName:\sReview Message\s(?<review_message_part3>[\D\s].\n)
Data Format 2 : 05/07/2023 05/11/2023 1 10000000000000000000001 10000000000000000000000 90001 OFF/ON 1 0 $33.10
VISIT NEW
Patient ID / 10001 MIL JS Hn DOB: 5/30/2030
Name:
If you have any questions, please call 888-888-8888
DDDDDD-LetterRef# 1234-120
Page 5 of 163Notice of Preliminary Findings
Review ID: 100001
Service From Service Thru Line Process Modif Process Co Uni New Overpat
Date: Date: # Cla Number Ren Cla Code Code Description Paid Units Amount
Review Message
hkdshvsjf
Format 2 Regex : (?<service_from>\d+/\d+/\d{4}) (?<service_to>\d+/\d+/\d{4}) \d* (?\d{24}).* (?<est_overpayment>.\d+.\d+).\s.\s.\s.\s.\s.\s.\s+Review ID: (?<review_id>\d).\s+.\s.\s.Patient ID . (?<patient_id>\d+) (?[\D\s,]+) DOB: (?\d+/\d+/\d{4}).\s.\s.\s+(?<review_message_part3>[\D\s].\n)
Data Format 3 : 07/01/2013 05/02/2030 1 100022222222222222222222 10003333333333333333333 93333 OF/OU 1 0 $10.22
VISIT NEW
If you have any questions, please call 8333-4333-3333333
DAAAAA-LetterRef# 799999-100
Page 6 of 163Notice of Preliminary Findings
Review ID: 1011111
Service From Service Thru Line Process Modif Process Co Uni New Overpat
Date: Date: # Cla Number Ren Cla Code Code Description Paid Units Amount
Patient ID / 10000 AD Lenr DOB: 3/14/2030
Name:
Review Message
fsdjnkfnsjff
Format 3 regex : (?<service_from>\d+/\d+/\d{4}) (?<service_to>\d+/\d+/\d{4}) \d* (?\d{24}).* (?<est_overpayment>.\d+.\d+).\s.\sPatient ID . (?<patient_id>\d+) (?[\D\s,]+) DOB: (?\d+/\d+/\d{4})\s+Name:.\s.\s.\s.\s.\s.\s.\s+Review ID: (?<review_id>\d).\s+.\s.\s.\s(?<review_message_part3>[\D\s].*\n)
I would appreciate any help finding a single regex that works for all three formats.
Could you post the regex expressions as preformatted text, using the </> button?
That would be clearer and would avoid any conversion errors.
Could you also tell us more about the values you want to extract? So far we understand the following characteristics:
1. Service from - Date format - xx/xx/xxxx
2. Service to - Date format - xx/xx/xxxx
3. Est Overpayment - Couldn’t identify the pattern properly
4. Patient ID - Between Patient ID and DOB keywords ?
5. Review ID - Couldn’t identify the pattern properly
6. Review Message - After Review Message keyword ?
Est overpayment is a dollar amount, which may include a decimal part.
Between the Patient ID and DOB keywords is the name, which may include a middle name, e.g. Nina Ken Joy or Nina Joy.
Review ID does not actually need to be extracted. It is usually 7 digits.
There is no static keyword after the Review Message. The review message can span multiple lines. There is an empty line after the last line of the review message, and that marks the end of the review message.
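Based on the characteristics described above, here is a minimal sketch of one possible combined pattern, written in Python for illustration. It assumes the fields always appear in the order claim row, Patient ID line, Review Message, as in the three samples, and it skips Review ID since it is not required. The names `DATE`, `RECORD`, `extract`, `patient_name`, `dob` and `review_message` are made up for this example. Python spells named groups `(?P<name>...)`; in a .NET-style engine that would be `(?<name>...)`.

```python
import re

# Dates in the samples use 1 or 2 digits for month and day, 4 for the year.
DATE = r"\d{1,2}/\d{1,2}/\d{4}"

RECORD = re.compile(
    # Claim row: two dates, then the dollar amount later on the same line.
    rf"(?P<service_from>{DATE}) (?P<service_to>{DATE}) "
    r"[^\n]*?(?P<est_overpayment>\$[\d,]+(?:\.\d+)?)[^\n]*\n"
    # Lazily skip whatever lines come before the Patient ID line.
    r"[\s\S]*?Patient ID / (?P<patient_id>\d+) (?P<patient_name>[^\n]+?) "
    rf"DOB: (?P<dob>{DATE})"
    # Lazily skip to the "Review Message" line (Review ID may be in between).
    r"[\s\S]*?Review Message[^\n]*\n"
    # The message runs until the first empty line or the end of the text.
    r"(?P<review_message>[\s\S]*?)(?=\n[ \t]*\n|\Z)"
)


def extract(text):
    """Return the captured fields of the first record as a dict, or None."""
    match = RECORD.search(text)
    return match.groupdict() if match else None


# Data Format 3 from the question, followed by the terminating empty line.
sample = (
    "07/01/2013 05/02/2030 1 100022222222222222222222 "
    "10003333333333333333333 93333 OF/OU 1 0 $10.22\n"
    "VISIT NEW\n"
    "If you have any questions, please call 8333-4333-3333333\n"
    "DAAAAA-LetterRef# 799999-100\n"
    "Page 6 of 163Notice of Preliminary Findings\n"
    "Review ID: 1011111\n"
    "Patient ID / 10000 AD Lenr DOB: 3/14/2030\n"
    "Name:\n"
    "Review Message\n"
    "fsdjnkfnsjff\n"
    "\n"
)

if __name__ == "__main__":
    print(extract(sample))
```

Because the gaps between the parts are matched lazily, the same pattern covers the case where the Review ID block sits before the Patient ID line and the case where it sits after it. One caveat: if a record is missing its Patient ID or Review Message line, the lazy gaps could run on into the next record, so it may be safer to split the text into records first.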