SYDNEY - NEW YORK
alot of dynamic text 1
NEW YORK - SYDNEY
alot of dynamic Text
NEW YORK - LONDON
alot of dynamic Text
LONDON - SYDNEY
alot of dynamic Text
I need to extract each connection along with the dynamic text under it.
This is one of those times when my first question would be: can we get the data in a better format? Always analyze and improve your processes if possible (maybe it isn’t), rather than automating bad processes.
You can achieve what you need using a Regular Expression (Regex). You can learn Regex from my MegaPost. I would strongly recommend looking at Section 5.
You have provided a sample and the expected output, but no information on the pattern.
To make a reliable Regex Pattern you must understand the pattern within the text.
What is consistent? What changes?
Is it always capitals and a dash?
How is it generated? System or OCR?
Will there be an opportunity to validate the result? (You could create a white-list of all the Cities in the world maybe).
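As a rough illustration of the white-list idea, here is a minimal Python sketch. The set KNOWN_CITIES and the function is_known_route are hypothetical names made up for this example, and the three cities are simply the ones that appear in your sample. It also assumes the two cities are always separated by " - " (space, dash, space), which is something you would need to confirm.

```python
# Hypothetical white-list; in practice this would come from a trusted source.
KNOWN_CITIES = {"SYDNEY", "NEW YORK", "LONDON"}


def is_known_route(line):
    """Return True if the line is exactly two known cities joined by ' - '."""
    parts = line.split(" - ")  # assumption: separator is space-dash-space
    return len(parts) == 2 and all(p.strip() in KNOWN_CITIES for p in parts)


print(is_known_route("SYDNEY - NEW YORK"))       # a route from the sample
print(is_known_route("alot of dynamic text 1"))  # a line of dynamic text
```

A check like this could run after the Regex as a second line of defence, so a stray line of capital letters in the dynamic text is not mistaken for a connection.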
The good news is I have created a Regex Pattern based upon your sample. However, it may not be perfect. You may want to ask a few questions and see what you can find out.
I have deduced that the pattern is something like this:
MUST be the start of a line.
MUST finish at the end of a line.
MUST contain only capital letters and spaces, with the two parts separated by a dash.
If it doesn’t have ALL these things then IT WILL NOT MATCH.
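The three rules above can be sketched in Python as follows. This is only one possible reading of the sample, not a definitive pattern: it assumes the dash always has a single space on each side and that words within a city name are separated by single spaces. The names ROUTE and extract_connections are made up for this example.

```python
import re

# Assumed pattern, deduced from the sample only:
#   ^ ... $      the whole line must match (re.MULTILINE makes ^/$ per line)
#   [A-Z]+(?: [A-Z]+)*   one or more capitalised words separated by one space
#   " - "        a dash with a space on each side
ROUTE = re.compile(
    r"^[A-Z]+(?: [A-Z]+)* - [A-Z]+(?: [A-Z]+)*$",
    re.MULTILINE,
)


def extract_connections(text):
    """Return (connection, text_under_it) pairs in document order."""
    matches = list(ROUTE.finditer(text))
    results = []
    for i, m in enumerate(matches):
        # The dynamic text runs until the next connection line, or to the end.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        results.append((m.group(0), text[m.end():end].strip()))
    return results


sample = """SYDNEY - NEW YORK
alot of dynamic text 1
NEW YORK - SYDNEY
alot of dynamic Text
NEW YORK - LONDON
alot of dynamic Text
LONDON - SYDNEY
alot of dynamic Text"""

for connection, body in extract_connections(sample):
    print(connection, "=>", body)
```

Be aware that a line of dynamic text that happens to be all capitals with a dash in it would also match, which is why it is worth finding out how the text is generated before relying on this.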