How to split a string correctly

Hello everyone!
I have a string:

1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424
2-3 jkawnnfno 12849182009 jefoe9nfiwnewn
sefcewfw 4-5 efqwffffffffffffffffffffffeqdqwd311235235 c 32535
ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef
6-7 717 emqfemfoefm 7210
30239 wijfipejp-0-9nn
8-9 717 84202kdkdi ieofj9230804s sfmi
fsjienigw09209
10-13 313 jwirgjwrigjjrwpogpog49029 8240284
nlsnfiefnqei84984 2389284 14-15 30139103951 jenneofneofnoue2898r 3o28rujfeif02
2048204ujf
г.ewfiemp8302830 r823jacm 8038rujo
16-17 aijiqwfiqj932932-3 ejpeikqpeir 2930ri ewkflmf
oifoij429028842 284028 0239ielmkf"

I split it and get an array of strings.
But I need the rows that start with the digits in bold (those are the page numbers).

Could you help me, please?

Page numbers can also have 3 digits.

1/ In what form do you get this data? From a file? As a result of OCR?
2/ Could you supply the expected result?

Cheers

Yes, from a file, via OCR.
I need these separate rows:
1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424
2-3 jkawnnfno 12849182009 jefoe9nfiwnewn
sefcewfw
4-5 efqwffffffffffffffffffffffeqdqwd311235235 гр 32535
ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef
6-7 717 emqfemfoefm 7210
30239 wijfipejp-0-9nn
8-9 717 84202kdkdi ieofj9230804s SFMI
fsjienigw09209
10-13 313 jwirgjwrigjjrwpogpog49029 8240284
nlsnfiefnqei84984 2389284
14-15 30139103951 jenneofneofnoue2898r 3o28rujfeif02
2048204ujf
г.ewfiemp8302830 r823jacm 8038rujo
16-17 aijiqwfiqj932932-3 ejpeikqpeir 2930ri ewkflmf
oifoij429028842 284028 0239ielmkf "

Like this?

1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424 :end:

2-3 jkawnnfno 12849182009 jefoe9nfiwnewn :end:

sefcewfw :end:

4-5 efqwffffffffffffffffffffffeqdqwd311235235 гр 32535 :end:

ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef :end:

Or like this?

1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424 :end:

2-3 jkawnnfno 12849182009 jefoe9nfiwnewn sefcewfw :end:

4-5 efqwffffffffffffffffffffffeqdqwd311235235 гр 32535 ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef :end:

Cheers

Yes,
as you stated above

Did you consider reading it as a CSV with a space as the delimiter?
https://docs.uipath.com/activities/docs/read-csv-file

It’s a PDF, not a CSV.

Which option?

What?

Hi @J0ska,

The 2nd solution you showed above…

Like this…
1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424 :end:

2-3 jkawnnfno 12849182009 jefoe9nfiwnewn sefcewfw :end:

4-5 efqwffffffffffffffffffffffeqdqwd311235235 гр 32535 ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef :end:

Hi @RPA3,

You can assign this expression to an array of strings:
Regex.Replace(content.Trim(), @"^((1 - )|\d+-\d+)", "#$1", RegexOptions.Multiline).Split('#', StringSplitOptions.RemoveEmptyEntries)

  • content.Trim() is your string
  • ^((1 - )|\d+-\d+) is the pattern that matches the page numbers at the start of a line
  • #$1 is the replacement that adds a '#' before each page number, so we can then split on it.
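
For anyone who wants to check the idea outside UiPath, here is a minimal Python sketch of the same mark-then-split approach. It is an illustration, not the exact expression above: the sample text is a shortened copy of the OCR output from the question, the helper name split_by_page is made up, the literal "1 - " is generalised to any single page number followed by " - " (an assumption), and line breaks inside each chunk are flattened to spaces to match the requested output. It also assumes the text itself never contains '#'.

```python
import re

# Shortened copy of the OCR text from the question (sample data only).
content = """1 - vdaefad 28-9686-87-09 efwefefw 212 14 w424
2-3 jkawnnfno 12849182009 jefoe9nfiwnewn
sefcewfw
4-5 efqwffffffffffffffffffffffeqdqwd311235235 гр 32535
ekfmqiefmpqiefmpqie 8308wjeifew 93203 iowefjoiqef
"""

# Page marker at the start of a line: "1 - " style or "2-3" style.
# Assumption: "\d+ - " generalises the hard-coded "1 - " from the answer.
PAGE_MARKER = re.compile(r"^(\d+ - |\d+-\d+)", re.MULTILINE)


def split_by_page(text, sep="#"):
    """Prefix each page marker with `sep`, split on `sep`, flatten line breaks."""
    marked = PAGE_MARKER.sub(sep + r"\1", text.strip())
    # Drop empty chunks and collapse whitespace (including newlines) to one space.
    return [" ".join(chunk.split()) for chunk in marked.split(sep) if chunk.strip()]


rows = split_by_page(content)
for row in rows:
    print(row, ":end:")
```

Note that a continuation line that happens to begin with something like 28-96 would also be treated as a page marker, so the pattern may need tightening for real OCR output.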

If you don’t want the page numbers in the result, just remove the $1 from the replacement string.
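
A small Python sketch of that variant, under the same assumptions as before (the sample text is made up, and "\d+ - " stands in for the literal "1 - "). Replacing the marker with a bare '#' instead of '#' plus the captured group drops the page numbers from every chunk:

```python
import re

# Made-up sample text in the same shape as the OCR output.
text = "1 - alpha beta\n2-3 gamma\ndelta\n10-13 epsilon"

# Replace each page marker with '#' only, so the page number itself is discarded.
marked = re.sub(r"^(?:\d+ - |\d+-\d+)", "#", text.strip(), flags=re.MULTILINE)

# Split on '#', drop empty chunks, and collapse line breaks to single spaces.
rows_without_pages = [" ".join(c.split()) for c in marked.split("#") if c.strip()]
print(rows_without_pages)
```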