Regular Expression for email data

I need to write regular expressions for City Source and City Destination. The expression for City Source should return 8 matches, and likewise the one for City Destination.

When I create a regular expression, it returns 16 matches, but I want separate results for City Source and City Destination.


Is it possible?
@Palaniyappan: Any ideas?


Hi,
If possible, could I have a look at the expression?
Cheers @Dipanshu


The expression I am using returns both the City Source and City Destination results.

I want to fetch the matches for each one individually.

The attached file is the one that needs to be scraped:
email_data1.txt (1.4 KB)

@Palaniyappan: please have a look.


Fine.
Hope these steps help you resolve this:
- Use a READ TEXT FILE activity, pass it the file path of the text file above, and store the output in a string variable named strinput.
- Now use a Matches activity with the input string strinput.ToString and the expression
"(?<=\d.\d.\d.\s).*(?=[^A-Za-z].)"

This gives us the value of City Source only.
To read it, pass the output variable of the Matches activity to a FOR EACH activity and change the type argument to System.Text.RegularExpressions.Match.
- Inside the loop, use a Write Line activity like this:
item.ToString
which will display the value of the City Source column alone.

For City Destination, similarly use another Matches activity after this one, with the same input string but with the expression
"(?<=\d.\d.\d.\s).*(?=Inv)"
and inside the loop, in the Write Line, use
Split(item.ToString," ")(1).ToString.Trim

Cheers @Dipanshu


@Palaniyappan: That works, but is there a solution for when I get multiple matches for the same email? This is an email thread, so the same data appears multiple times in the email, which creates duplicates for some columns. Can we restrict the regex to a particular area of the text, or do you have another solution?

@Palaniyappan: Can you suggest which approach I should implement here?

Yes, of course we can.
Does that text lie between any fixed strings or text?
Cheers @Dipanshu

@Palaniyappan: Yes, the table text lies between fixed strings. Using them, we can get the correct match through regex.

Right?


Yes, we can.
@Dipanshu


@Palaniyappan
I also found a solution: we can split the email data first, and then I can create the regex on the result.
item.Body.Split({"Text to be enter"}, StringSplitOptions.None)

I have entered the starting text, and it now fetches the complete data from that text onward.
Now I need to define the end text. Can you help me with how to set the end point text, so that we can run the regex on that section only and the match count comes from the table data?
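
One way to bound the text on both sides is a single non-greedy match between the two marker strings. The following is a minimal Python sketch of that idea, not UiPath code. The markers "Table Start" and "Table End", the sample body, and the helper name text_between are all hypothetical placeholders. The same pattern shape should carry over to .NET's Regex with RegexOptions.Singleline, but treat that as an assumption to verify in the workflow.

```python
import re

def text_between(body, start_marker, end_marker):
    """Return the text between the first start_marker and the next end_marker."""
    # re.escape keeps any special characters in the markers literal;
    # (.*?) is non-greedy, so the match stops at the nearest end marker.
    pattern = re.escape(start_marker) + r"(.*?)" + re.escape(end_marker)
    match = re.search(pattern, body, flags=re.DOTALL)
    return match.group(1) if match else None

# Hypothetical email body: the table appears once in the reply
# and again in the quoted part of the thread.
body = "Hello\nTable Start\nrow A\nrow B\nTable End\nRegards\nTable Start\nrow A\nTable End"
section = text_between(body, "Table Start", "Table End")
print(repr(section))
```

Because only the first bounded section is returned, any regex run on it afterwards cannot pick up the repeated copies further down the thread.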

@Palaniyappan:

I need a little help finding the regex for the data below.

Dipanshu Rastogi

12-Sep-2019

India

Delhi

Noida

Nairobi

Air

The above is the scraped data, and I want to fetch only Noida.
I tried your solution "(?<=\d.\d.\s).*(?=[^A-Za-z].)" but could not build a regex that returns Noida. Please suggest how I can create the regex to fetch the Noida value: how can I skip the lines in between so that the regex returns Noida? (Please anchor the start of the regex on the date format.)

Can anyone suggest something?

I don't fully understand your problem here. If you want to get the fifth item in your string, you can split on new lines and get the value using this code:

String.Join("|",System.Text.RegularExpressions.Regex.Split(StringVariable,"\n")).Trim.Split({"||"},StringSplitOptions.None)(4)
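
The same idea, read as "split into lines, drop the blank ones, take the fifth", can be sketched in Python as follows. This is an illustration rather than UiPath code. The variable names are made up, and the sample text is the scraped data posted above, assuming one blank line between values.

```python
scraped = (
    "Dipanshu Rastogi\n\n12-Sep-2019\n\nIndia\n\nDelhi\n\nNoida\n\nNairobi\n\nAir"
)

# splitlines() handles both \n and \r\n line endings;
# lines that are empty after stripping are filtered out.
fields = [line.strip() for line in scraped.splitlines() if line.strip()]

fifth = fields[4]  # zero-based index 4 is the fifth non-empty line
print(fifth)
```

This only works while the value stays in the fifth position, so it depends on the scraped layout being stable.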

Hi @vickydas,

Actually, this is a small part of the scraped data.
Using regex, we fetched the first value (India) with ((?<=\d.\d.\s).*(?=[^A-Za-z].)). I want to know how I can fetch Noida with a regex along the same lines.
In the same regex ((?<=\d.\d.\s).*(?=[^A-Za-z].)), I want to skip India and Delhi. What changes need to be made to the regex?

@Palaniyappan: can you suggest something? I am waiting for your response.


Can you please provide the input string
and tell me what output you need?
@Dipanshu


@iVishalNayak

Scraped data:
Dipanshu Rastogi

12-Sep-2019

India

Delhi

Noida

Nairobi

Air

I need a regex that fetches Noida as a single match.

Sure.

If "Delhi" is constant, then this would be the solution:
(?<=Delhi\s\s)\w*
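
To try that lookbehind outside UiPath, a small Python sketch like the one below can be used. It is only an illustration, run on the scraped sample from this thread with \n line endings assumed. The variable names are made up.

```python
import re

scraped = "Dipanshu Rastogi\n\n12-Sep-2019\n\nIndia\n\nDelhi\n\nNoida\n\nNairobi\n\nAir"

# (?<=Delhi\s\s) requires "Delhi" plus exactly two whitespace characters
# immediately before the match; \w* then takes the word that follows.
match = re.search(r"(?<=Delhi\s\s)\w*", scraped)
city = match.group(0) if match else None
print(city)
```

Note that the pattern expects exactly two whitespace characters after Delhi. If the text uses \r\n line endings, there are four, and the expression returns an empty match instead of Noida.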


Hello @Dipanshu
Use this code in an Assign activity. It first matches your first pattern to get the word India (it can be any word), then uses that word to skip Delhi and get Noida (which can also be any word).

Check it out:

(System.Text.RegularExpressions.Regex.Match(strrrr,"((?<=\d.\d.\s).*\n).*").ToString+","+ System.Text.RegularExpressions.Regex.Split((System.Text.RegularExpressions.Regex.Match(strrrr, "(?<="+System.Text.RegularExpressions.Regex.Match(strrrr,"((?<=\d.\d.\s).*\n).*").ToString.Trim+"\s\s)\w+\n\s\w+")).ToString,"\n\s+")(1)).Trim

strrrr is the string variable.
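
The one-line expression above is easier to follow when split into its two steps. The following Python sketch mirrors the same logic on the scraped sample from this thread. It is a rough equivalent for illustration, not the UiPath expression itself, and the names first, pair, and second are made up.

```python
import re

strrrr = "Dipanshu Rastogi\n\n12-Sep-2019\n\nIndia\n\nDelhi\n\nNoida\n\nNairobi\n\nAir"

# Step 1: starting right after the date, take the rest of that (empty) line
# plus the next line, then trim it. On this sample that leaves "India".
first = re.search(r"(?<=\d.\d.\s).*\n.*", strrrr).group(0).strip()

# Step 2: anchor on that word, match the next two words,
# then split them apart and keep the second one.
pair = re.search(r"(?<=" + re.escape(first) + r"\s\s)\w+\n\s\w+", strrrr).group(0)
second = re.split(r"\n\s+", pair)[1].strip()

print(first + "," + second)
```

Like the original, this relies on one blank line between values and on single-word values. A multi-word city name would break the \w+ parts.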
