I am trying to extract data from the body of an email and was wondering if Document Understanding is required? I am able to read the email from an Outlook folder. The body is well structured (although certain elements may need to be parsed into multiple fields). At first I thought “document understanding” would apply, but this is just text and maybe some simple string commands would be simpler and cheaper? Please see example below.
You had asked what the output would look like. I am envisioning storing the data elements in an Excel (.xlsx) table. This table will then be used to automate data entry into a corresponding website. Data fields are indicated in most cases by “:” (for example, Case ID:).
Child’s Name/DOB/FSFN Child ID will be delimited by “,”. Parent Name: will be broken out by " ".
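To illustrate the "simple string commands" idea, here is a minimal sketch in Python (outside UiPath, just to show the logic). The sample body and its values are made up, and `parse_fields` is a hypothetical helper name; it assumes each field sits on its own line as "Label: value".

```python
def parse_fields(body):
    """Split 'Label: value' lines into a dict; lines without ':' are skipped."""
    fields = {}
    for line in body.splitlines():
        # partition splits on the first ':' only, so values may contain ':'
        label, sep, value = line.partition(":")
        if sep and label.strip():
            fields[label.strip()] = value.strip()
    return fields


# Hypothetical email body; real labels and values will differ.
sample_body = "Case ID: 12345\nParent Name: Jane Doe\n"

fields = parse_fields(sample_body)

# Parent Name is broken out on the first space.
parent_first, _, parent_last = fields["Parent Name"].partition(" ")
```

In UiPath the same steps would map to string Split calls in Assign activities, so no Document Understanding is involved in this approach.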
Hopefully these patterns will help. Insert the Regex patterns into a ‘Matches’ activity.
Regex Pattern:
(?<=Child’s Name\/DOB\/FSFN Child ID\n)([^,]+),\s([^,]+),\s((19|20)\d{2}-\d{2}-\d{2}),\s(.*)
Then use the following to convert your results to string:
How to get Group 1 (Child’s First Name) results:
INSERTVARIABLE(0).Groups(1).ToString
How to get Group 2 (Child’s Last Name) results:
INSERTVARIABLE(0).Groups(2).ToString
How to get Group 3 (DOB) results:
INSERTVARIABLE(0).Groups(3).ToString
How to get Group 5 (FSFN) results (Group 4 is the nested (19|20) century group inside the DOB, so it is skipped):
INSERTVARIABLE(0).Groups(5).ToString
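If you want to check the group numbering outside UiPath, here is a rough Python sketch using the same pattern. The sample text is invented, and Python's `re` engine is not the .NET engine UiPath uses, so treat this as an approximation that should behave the same for this particular pattern.

```python
import re

# Same pattern as above. The apostrophe in "Child’s" is the curly one (U+2019);
# it must match whatever character the real email contains.
PATTERN = re.compile(
    r"(?<=Child’s Name\/DOB\/FSFN Child ID\n)"
    r"([^,]+),\s([^,]+),\s((19|20)\d{2}-\d{2}-\d{2}),\s(.*)"
)

# Hypothetical email fragment, values are made up.
sample_body = "Child’s Name/DOB/FSFN Child ID\nJohn, Smith, 2015-04-09, 123456789\n"

match = PATTERN.search(sample_body)

first_name = match.group(1)
last_name = match.group(2)
dob = match.group(3)
century = match.group(4)   # nested (19|20) group, usually not needed
fsfn = match.group(5)
```

Here `match.group(n)` plays the role of `.Groups(n).ToString` in the expressions above.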
After the Matches activity, use a Write Line activity (or an Assign activity) and replace INSERTVARIABLE in the expressions above with the Result variable from the Matches activity.
(0) = the 1st match. If you have multiple matches, increase this number. Use a “For Each” activity to write out all results.
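As a sketch of the "For Each" idea, this Python snippet loops over every match and collects one row per child. The two-child sample body is invented, and it assumes the header line repeats before each child's line, which may not be how your emails look.

```python
import re

PATTERN = re.compile(
    r"(?<=Child’s Name\/DOB\/FSFN Child ID\n)"
    r"([^,]+),\s([^,]+),\s((19|20)\d{2}-\d{2}-\d{2}),\s(.*)"
)

# Hypothetical body with two child entries, values are made up.
sample_body = (
    "Child’s Name/DOB/FSFN Child ID\n"
    "John, Smith, 2015-04-09, 111\n"
    "Notes: none\n"
    "Child’s Name/DOB/FSFN Child ID\n"
    "Amy, Jones, 2012-11-30, 222\n"
)

# One tuple per match: (first name, last name, DOB, FSFN).
rows = [
    (m.group(1), m.group(2), m.group(3), m.group(5))
    for m in PATTERN.finditer(sample_body)
]
```

Each tuple in `rows` corresponds to one iteration of the For Each over the Matches result, and could become one row in the Excel table.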