Extract Words between a string and the end of the line in a multi-line string

Hi,

I have this text and want to extract the text after "Specimen Type: " and before the \r character at the end of that line.

HPV PRIMARY SCREENING 

    Specimen Type: Liquid based cytology (SurePath)

    HrHPV: Not detected

                              CYTOLOGY 
    Site:  Cervical

    The specimen is satisfactory for evaluation.

    NEGATIVE FOR INTRAEPITHELIAL LESION OR MALIGNANCY.

    The next HPV screening test should be taken in 5 years, based on the
    NCSP Register history. Overseas tests are noted on the request form
    but are not recorded in the NCSP Register. This recommendation is
    based on current test results and the NCSP Register records only and
    may need to be modified if other results were reported overseas.
    Please forward copies of overseas pathology reports or an overseas
    specialist letter confirming dates and results of previous pathology
    tests to the NCSP Register.

I have tried this regex in the Find Matching Patterns activity [(?<=Specimen Type: \s+)[\S\s]?(?=\r)] but this matches every line in the string apart from the last line.

What I want is for the activity to return ‘Liquid based cytology (SurePath)’ to the output string.

Any help is much appreciated.

Hello

Try this pattern:
(?<=Specimen Type:).+

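It will return all text after the label up to the end of the current line. In .NET, . matches every character except \n, so the raw match still includes the leading space and the trailing \r; the Trim in the assign activity below removes both.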

Use it in an assign activity like this:
Left assign
str_Result

Right assign
System.Text.RegularExpressions.Regex.Match(yourStr, "(?<=Specimen Type:).+").ToString.Trim
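For anyone who wants to try the pattern outside UiPath, here is a rough Python sketch of the same idea. It is only a stand-in for the .NET call above: the variable names and the shortened sample text are made up for illustration, and the \r\n line endings are assumed from the question.

```python
import re

# Shortened sample report; the \r\n line endings are an assumption based on the question.
report = (
    "HPV PRIMARY SCREENING \r\n"
    "\r\n"
    "    Specimen Type: Liquid based cytology (SurePath)\r\n"
    "\r\n"
    "    HrHPV: Not detected\r\n"
)

# The lookbehind anchors the match just after the label; ".+" then runs to the end of the line.
match = re.search(r"(?<=Specimen Type:).+", report)

# "." stops at \n but not at \r, so the raw match keeps a leading space and a trailing \r.
raw = match.group(0)

# Stripping plays the role of .Trim in the assign expression.
specimen_type = raw.strip()

print(repr(raw))
print(specimen_type)
```

The repr() output makes the stray space and \r visible, which is why the trim step matters.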

Some feedback on your provided pattern. A good attempt with the right approach but with a few mistakes:

  • There was an extra space after the colon ‘:’, so nothing would match unless the colon was followed by two or more whitespace characters.
  • The square brackets [\S\s] match every character, whitespace or not, including line breaks, so the match is not confined to one line.
  • The ? quantifier on the square brackets was incorrect, as it means ‘0 or 1’ characters. You should have used ‘+’, which means ‘1 or more’.
  • Another way of writing your intended pattern with fewer characters is like this.
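The first three points can be checked one at a time with a small Python sketch. This is only an approximation of the UiPath behaviour: Python's re module does not accept a variable-length lookbehind such as (?<=Specimen Type: \s+), so each piece of the original pattern is tested separately, and the two-line sample and variable names are made up for illustration.

```python
import re

# Two lines from the report; the \r\n line endings are an assumption based on the question.
sample = (
    "    Specimen Type: Liquid based cytology (SurePath)\r\n"
    "    HrHPV: Not detected\r\n"
)

# 1. A literal space followed by \s+ needs at least two whitespace characters after the colon.
with_extra_space = re.search(r"Specimen Type: \s+", sample)
without_extra_space = re.search(r"Specimen Type:\s+", sample)

# 2. [\S\s] matches any character, line breaks included, so "+" runs on past the end of the line.
across_lines = re.search(r"(?<=Specimen Type:)[\S\s]+", sample).group(0)

# 3. "?" means zero or one, so at most a single character is captured.
zero_or_one = re.search(r"(?<=Specimen Type:)[\S\s]?", sample).group(0)

print(with_extra_space)          # no match on this sample
print(without_extra_space)       # a match object
print("HrHPV" in across_lines)   # the match spilled onto the next line
print(repr(zero_or_one))         # a single character
```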

You can check out my Regex megapost if you want to see a few more examples.

Cheers

Steve

Thanks, Steve. That’s brilliant.

I had an issue with the quote characters in
system.Text.RegularExpressions.Regex.Match(yourStr, “(?<=Specimen Type:).+”).ToString.Trim
UiPath required me to delete them and retype them. It does not recognise the curly quotes copied from the forum message and reports this error:
“Assign: Expression Activity type ‘VisualBasicValue`1’ requires compilation in order to run. Please ensure that the workflow has been compiled.”

Once I got past that, it’s all going okay.
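If retyping the quotes by hand gets tedious, one option is to straighten them before pasting. The snippet below is a minimal Python sketch of that idea, not a UiPath feature; the helper name straighten_quotes and the sample string are hypothetical.

```python
# Map typographic (curly) quotes to their plain ASCII equivalents.
SMART_QUOTES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(SMART_QUOTES)


# Expression as it might look after copying from a forum page (curly quotes).
pasted = "Regex.Match(yourStr, \u201c(?<=Specimen Type:).+\u201d).ToString.Trim"
cleaned = straighten_quotes(pasted)
print(cleaned)
```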

FYI, this is a known issue, as described in the following post.

And the problem with your original pattern seems to be the extra whitespace at the red arrow in the following image.

Regards,

Good pickup. Yeah, quotation marks are sometimes troublesome when copied and pasted.
