I have this text that I am extracting from a PDF using UiPath PDF extraction.
Problem
Usually each field comes out in one piece: the name is together and the review message is together. The text below comes from a tabular view in the PDF, and here the name “JASON DIANNA JOHN” is split across two lines, unlike “JAY, BAY”. The text after “Review Message(s):” is also split: instead of one run such as “Documentation does not support and billing error of inpatient could have been billed as outpatient”, it is interleaved with other data I need. The DOB also appears in two different locations.
Data I need
Ex: from the first row of data
Patient Id: 521178201
Name: JASON DIANNA JOHN
DOB: 10/5/2002
Review message: Documentation does not support medical necessity
Service From Date: 02/09/2017
Service Thru Date: 02/10/2020
Claim number: 100050010111901715802222
This is what I have so far. I am stuck at the name, since the name is broken up:
Patient ID \/ Name:\s+(?<id>\d*) (?<name1>.*)\s+.*DOB:\s(?<dob>\d+\/\d+\/\d{4})
So how do I get the broken-up data? Should I capture name1, name2, review msg1 and msg2 as separate groups and then join them? I do need this in groups instead of individual matches.
Patient ID / Name: 521178201 JASON, Sex: Female Patient Account # :
DIANNA JOHN DOB: 10/5/2002
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 02/09/2017 Date: 100050010111901715802222 medical necessity
02/10/2020
Patient ID / Name: 310976610 JAY, BAY Sex: Female Patient Account # :
DUAA DOB: 7/2/2007
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 02/10/2013 Date: 10006004125561531161888 medical necessity
05/30/2015
Patient ID / Name: 666310555 Anie, Baby DOB: 4/15/2016 Sex: Male Patient Account # :
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 03/10/2010 Date: 100055530201666962948521 medical necessity
03/22/2014
Patient ID / Name: 222333136 Anu, Json DOB: 1/15/2012 Sex: Female Patient Account # :
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 05/04/2011 Date: 100020030201504215275522 and billing error of inpatient could have
11/22/2012 been billed as outpatient.
All data is dummy data, not real data.
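One way the "capture the pieces in separate groups, then join" idea could look is sketched below. It is written in Python only so it can be run and checked on its own; it is not a UiPath workflow. The group names (name1, name2, dob1, dob2, msg1, msg2, msg3, from_date, thru_date, claim) and the helpers join_parts and parse_records are made up for illustration, and the pattern assumes every record follows the line layout of the sample above. Python writes named groups as (?P<name>...), while .NET regex, which UiPath uses, writes them as (?<name>...). Commas in the name are kept as they appear in the text.

```python
import re

# One pattern per record; optional groups cover the parts that move around.
RECORD = re.compile(r"""
    Patient\ ID\ /\ Name:\s+(?P<id>\d+)[ \t]+
    (?P<name1>[^\n]*?)[ \t]*                               # name text on the first line
    (?:DOB:[ \t]*(?P<dob1>\d{1,2}/\d{1,2}/\d{4})[ \t]*)?   # DOB is sometimes on the first line
    Sex:[^\n]*\n
    (?:(?P<name2>[^\n]*?)[ \t]*                            # optional wrapped line: rest of name
       DOB:[ \t]*(?P<dob2>\d{1,2}/\d{1,2}/\d{4})[^\n]*\n)? # ... followed by the DOB
    Service\ From[^\n]*?Review\ Message\(s\):[ \t]*(?P<msg1>[^\n]*)\n
    Date:[ \t]*(?P<from_date>\d{1,2}/\d{1,2}/\d{4})[ \t]+
    Date:[ \t]*(?P<claim>\d+)[ \t]*(?P<msg2>[^\n]*)\n
    (?P<thru_date>\d{1,2}/\d{1,2}/\d{4})[ \t]*(?P<msg3>[^\n]*)
""", re.VERBOSE)


def join_parts(*parts):
    """Join the non-empty fragments of a value that wrapped across lines."""
    return " ".join(p.strip() for p in parts if p and p.strip())


def parse_records(text):
    """Return one dict per record found in the extracted text."""
    text = text.replace("\r\n", "\n")  # normalise Windows line endings
    records = []
    for m in RECORD.finditer(text):
        records.append({
            "id": m["id"],
            "name": join_parts(m["name1"], m["name2"]),
            "dob": m["dob1"] or m["dob2"],  # whichever location matched
            "review_message": join_parts(m["msg1"], m["msg2"], m["msg3"]),
            "from_date": m["from_date"],
            "thru_date": m["thru_date"],
            "claim": m["claim"],
        })
    return records


# First and last records of the dummy sample from the post.
SAMPLE = """\
Patient ID / Name: 521178201 JASON, Sex: Female Patient Account # :
DIANNA JOHN DOB: 10/5/2002
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 02/09/2017 Date: 100050010111901715802222 medical necessity
02/10/2020
Patient ID / Name: 222333136 Anu, Json DOB: 1/15/2012 Sex: Female Patient Account # :
Service From Service Thru Claim Number: Review Message(s): Documentation does not support
Date: 05/04/2011 Date: 100020030201504215275522 and billing error of inpatient could have
11/22/2012 been billed as outpatient.
"""

if __name__ == "__main__":
    for record in parse_records(SAMPLE):
        print(record)
```

The design choice here is to let the regex capture each fragment where it physically sits and do the joining afterwards, because a single group cannot skip over the unrelated text that the table layout puts between the fragments. If the real extraction produces a different line order, the pattern would need adjusting.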