Hello,
An email address is contained between angle brackets email@host.domain, so that I can extract it based on such format. But is there a way to be able to get email address when the string comes with no brackets at all?
I used to have .split and .regex functions. The first one isn’t that robust (always looks for <>), the latter I might be using wrong but at least I can associate it with a keyword ‘From:’.
Now, can anyone suggest please some string vba expression to provide as value to my variable?
You can make your Regex more precise right?
Use an expression to match
- the special character @
- Should contain ‘.’
- can contain any character before and after ‘a’ till a space is encountered to separate it from other words
- Look for characters like <> or “” that might contain the email enclosed inside it
Regex will be a perfect solution.
Use System.Regex.RegularExpression.Match(stringinput, regex)
Define your regex as:
[^<|].@...*[^>]
This will match any email:
1.may or may not be enclosed between <>
2.contains @ and .
Use this RFC
(?:[a-z0-9!#$%&'*+/=?^_
{|}~-]+(?:.[a-z0-9!#$%&'*+/=?^_{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Reference : RegexValidation
Source 2: RailRoad Diagram
@Lakshay_Verma thank you for your suggestion, wondering why I can’t use .Split to get text in between “FROM:EMAIL@HOST.DOMAINSUBJECT:” ?
I try as I might different combinations of .split but none of them gives me just an email, do you have any idea?
You want to use split for FROM:EMAIL@HOST.DOMAIN ?
Then you can try as “FROM:EMAIL@HOST.DOMAIN”.split(“:”)(1)
Hope this helps you
Thanks
try this
“FROM:EMAIL@HOST.DOMAINSUBJECT:”.split(":"c)(1)
@Lakshay_Verma thanks, i tried it and subject remains appended to domain name, I can get rid of it by .replace(“SUBJECT”). Wondering, is there any other way around?
There is a way out, if your domain is specific like .com/.org
Is that a case ?
It does deal with mentioned domains, yes
Except it would be better to have rather robust mechanism to extract client’s emails of a different format, .replace helps cut an unwanted part, considering recent tests it’s ok, thanks
Removing a partial part is little tricky, only way to do is using ML otherwise best to move with replace as it fit your purpose.