Split repeating element


#1

Hi,

Please give me some advice or guidance.
The situation is:
I have scraped an entire site into one string, which I need to cut into smaller elements.
The structure is:

-Section 3.1


-Section 3.2
-Section 3.3
-Section 3.4.1
-Section 3.4.4
-Section 3.4.5
-Section 3.2
-Section 3.3
-Section 3.4.1
-Section 3.4.4
-Section 3.2
-Section 3.3
-Section 3.4.1
-Section 3.4.5


-Section 3.5
-Section 3.6
-Section 3.7

I would like to cut out the repeating sections between the lines, but if I use strValues.Split({"Section 3.2","Section 3.5"},StringSplitOptions.None) then it splits at every marker: 3.2 to 3.2 to 3.2 to 3.5 <---- I need all of this in one string, so that I can then cut it into parts in a loop. The reason is that the number of elements varies: on other pages this repeating section can contain more or fewer elements than in the example.

Many thanks for your help,
Regards
@fudi5


#2

Hi @fudi5,

Store it in a String variable -> strValue
Create a list variable -> listvalue, of type List(Of String)
listvalue = strValue.Split(Environment.NewLine.ToArray, StringSplitOptions.RemoveEmptyEntries).ToList()

Then use Assign activities:
listvalue = listvalue.Distinct().ToList()

strValue = String.Join(Environment.NewLine, listvalue)

This will remove the duplicate values.

Regards,
Arivu


#3

Hi @arivu96,

Sorry my friend, but you didn't understand the problem. Your solution doesn't help here. I need the whole string, but I need to cut out the part that is between the lines. I don't need to remove the duplicated parts… I need them, but in one string.
Part1
Part2 with the repeating parts, which I want isolated…
Part3

I need this Part2 cut out. That's all, and the one condition is that it should be in one string.

Regards,
@fudi5


#4

Hi,

If it’s always between 3.1 and 3.5, what if you just split by those?

txt.Split({"-Section 3.1"},System.StringSplitOptions.None)(1).Split({"-Section 3.5"},System.StringSplitOptions.None)(0)

That would give you only the text between those Sections.

There’s a Regex solution to this as well, but I was just giving a quick answer. I’m also not completely sure what solution you need, because there might be more to it than that.
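
For reference, the Regex route mentioned above could look roughly like the sketch below. It is only an illustration, not a tested workflow: the sample text, the module name and the variable names are assumptions modelled on the structure from post #1, and the pattern assumes the markers are written exactly as "-Section 3.1" and "-Section 3.5".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexBetweenSections
    Sub Main()
        ' Sample input is an assumption, shortened from the structure in post #1.
        Dim lines As String() = {
            "-Section 3.1",
            "-Section 3.2", "-Section 3.3", "-Section 3.4.1",
            "-Section 3.2", "-Section 3.3",
            "-Section 3.5", "-Section 3.6"}
        Dim txt As String = String.Join(Environment.NewLine, lines)

        ' (?s) lets "." also match newline characters.
        ' The lazy ".*?" stops at the first "-Section 3.5" after "-Section 3.1".
        Dim m As Match = Regex.Match(txt, "(?s)-Section 3\.1(.*?)-Section 3\.5")

        If m.Success Then
            ' Group 1 is everything between the two markers, as one string.
            Dim middle As String = m.Groups(1).Value.Trim()
            Console.WriteLine(middle)
        Else
            Console.WriteLine("Markers not found")
        End If
    End Sub
End Module
```

In an Assign activity the same idea would presumably shrink to a single expression such as System.Text.RegularExpressions.Regex.Match(txt, "(?s)-Section 3\.1(.*?)-Section 3\.5").Groups(1).Value.Trim, with the caveat that it returns an empty string when the markers are missing.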

Regards.


#5

Thanks for the help. I'll try this solution.


#6

Sorry, forgot one other thing. Make sure you add a .Trim on the end, so it removes any leading and trailing newline characters. Regards.
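
Putting the Split from post #4 together with the .Trim, and then cutting the middle part into its repeating groups in a loop (as asked in post #1), might look like the sketch below. The sample text and all names are assumptions for illustration, and it assumes every repeating group starts with "-Section 3.2".

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitRepeatingGroups
    Sub Main()
        ' Sample input is an assumption, shortened from the structure in post #1.
        Dim lines As String() = {
            "-Section 3.1",
            "-Section 3.2", "-Section 3.3", "-Section 3.4.1", "-Section 3.4.4",
            "-Section 3.2", "-Section 3.3", "-Section 3.4.1",
            "-Section 3.5", "-Section 3.6", "-Section 3.7"}
        Dim txt As String = String.Join(Environment.NewLine, lines)

        ' Take the text after the first "-Section 3.1" ...
        ' (index 1 does not exist if the marker is missing, so this would throw).
        Dim afterStart As String = txt.Split({"-Section 3.1"}, StringSplitOptions.None)(1)
        ' ... then keep only what comes before "-Section 3.5", trimmed of newlines.
        Dim middle As String = afterStart.Split({"-Section 3.5"}, StringSplitOptions.None)(0).Trim()

        ' The lookahead splits in front of each "-Section 3.2" without removing it,
        ' so every group keeps its own heading. Empty pieces are skipped.
        For Each part As String In Regex.Split(middle, "(?=-Section 3\.2)")
            If part.Trim().Length > 0 Then
                Console.WriteLine("--- group ---")
                Console.WriteLine(part.Trim())
            End If
        Next
    End Sub
End Module
```

The variable middle holds the whole repeating block as one string, and the loop works no matter how many groups a given page contains.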