Scraping email text - issues getting correct email info

Hello there,

I’m scraping customer information from emails and have been using string manipulation to get the customer’s first and last names, as well as their email address.

An issue I am facing is that customers sometimes include their middle name too, so I end up scraping the middle name as the last name and the last name as the email.

E.g., I have coded for the three fields below (first/last/email):
Customer Details: John Smith Johnsmith@email.com

But if the customer writes:
Customer Details: John Steven Smith johnsmith@email.com

Then I get Steven as the surname and Smith as the email.

Is there a simple way for the bot to tell whether the third word is an email address or another name, so that it knows which word is the surname? Perhaps something using the @ in the email as an identifier?

Thanks for your help, sorry I’m new!

Hi @dr1992,

You could try using regex to retrieve the email ID from the mail by passing the email body as the input to the regex.
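
For readers who want to see the idea outside the bot tool, here is a rough Python sketch of pulling the email out of the body with a regex. The pattern and the function name `extract_email` are my own assumptions for illustration, not the expression the RegEx Builder generates, and the pattern is deliberately simple rather than a full email validator.

```python
import re

# Deliberately simple pattern (an assumption for this sketch):
# a local part, "@", then a dotted domain.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def extract_email(body):
    """Return the first email-like token in body, or None if there is none."""
    match = EMAIL_RE.search(body)
    return match.group(0) if match else None

print(extract_email("Customer Details: John Steven Smith johnsmith@email.com"))
# prints: johnsmith@email.com
```

Because the match is keyed on the @, it gives the same result whether or not a middle name comes before the address.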

Thanks,
Sethu

Hi @dr1992

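Please try this: split the string on spaces and check the count.

After the split, if the count is 3 (first name, last name, email),

After the split, if the count is 4 (first name, middle name, last name, email),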

Thanks


Hi @dr1992,

Please refer to the screenshot I posted, which shows how to retrieve only the email address from the text using regex.

Thanks,
Sethu

Hi @dr1992

This is the output of the flow execution; you can see in the output panel that the value is fetched properly.

Thanks,
Sethu

Hi Sethu,

Thanks for this. I understand regex would get the email, but how would the bot then know whether or not a middle name is present, and so which word is the last name?

Hi @dr1992 ,

I understand your question. An email address has the format username@gmail.com. We set the regex to retrieve only the email ID by selecting the Email option in the dropdown of the RegEx Builder, so this regex will retrieve only the email ID from the text.

If you need to retrieve the last name and middle name, another regex can be used for those.
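
As a rough illustration of what such a name regex could look like, here is a Python sketch with an optional middle-name group. Everything in it is an assumption on my part: the `Customer Details:` label is taken from the example in the question, the names `NAME`, `DETAILS_RE` and `parse_details` are hypothetical, and it only allows a single middle name.

```python
import re

# A name token is any run of characters without whitespace or "@"
# (an assumption made for this sketch).
NAME = r"[^\s@]+"

DETAILS_RE = re.compile(
    r"Customer Details:\s*"
    rf"(?P<first>{NAME})"
    rf"(?:\s+(?P<middle>{NAME}))?"   # optional single middle name
    rf"\s+(?P<last>{NAME})"
    rf"\s+(?P<email>{NAME}@{NAME})"  # the "@" marks the email token
)

def parse_details(text):
    """Return a dict of first/middle/last/email, or None if nothing matches."""
    match = DETAILS_RE.search(text)
    return match.groupdict() if match else None

print(parse_details("Customer Details: John Steven Smith johnsmith@email.com"))
print(parse_details("Customer Details: John Smith Johnsmith@email.com"))
```

When no middle name is present, the `middle` group is simply `None`, so the last name is always the word directly before the email.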

Thanks,

Sethu

I think you misunderstand; sometimes the customer will add a middle name and sometimes they won’t.

I know the regex will work for the email, but how will the bot then know whether there is a middle name or not, so that it can assign the last name correctly?

@dr1992 have you tried my method?

Thanks

I haven’t yet - how would the bot know which scenario is which? Is there something in place to determine which would be middle and last name?

@dr1992 Yes, we have a switch, and based on the condition it will execute the corresponding case.

"how would the bot know which scenario is which"

We split the email string on spaces and count the resulting parts.

John Smith Johnsmith@email.com - splitting this gives 3 parts

John Steven Smith johnsmith@email.com - splitting this gives 4 parts

Based on this count, we execute the matching switch case.
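
The same split-and-count logic can be sketched in Python, with an if/elif standing in for the switch. This is only an illustration of the idea, not the workflow from the screenshots; the function name `parse_customer` and the handling of the `Customer Details:` label are assumptions.

```python
def parse_customer(line):
    """Split the details on whitespace and branch on the number of parts."""
    # Drop a leading label such as "Customer Details:" if one is present.
    head, sep, tail = line.partition(":")
    details = tail if sep else line

    parts = details.split()
    if len(parts) == 3:
        # first name, last name, email
        first, last, email = parts
        middle = None
    elif len(parts) == 4:
        # first name, middle name, last name, email
        first, middle, last, email = parts
    else:
        raise ValueError(f"expected 3 or 4 parts, got {len(parts)}")

    return {"first": first, "middle": middle, "last": last, "email": email}

print(parse_customer("Customer Details: John Smith Johnsmith@email.com"))
print(parse_customer("Customer Details: John Steven Smith johnsmith@email.com"))
```

Any other count (for example two middle names) raises an error here, so those inputs can be routed to a default case or to manual review instead of being mis-assigned.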

Please refer to the screenshots I provided earlier.

Thanks

I managed to get this working with that explanation, thank you!

