Regex line break problem

Hey everyone,

I’m encountering an issue with regex in UiPath when processing text extracted from a PDF. I’m testing my regex patterns on regexstorm.net, and they work perfectly there. However, when I run the same patterns in UiPath Studio, I get no matches.

Here are the details:

Regex Patterns I’m Using:

SupplierName = "(?<=dat\s)[A-Za-zÀ-ÿ\s\.,&\-]+(?=,\s+alle)"
Date = "(?<=tot\s*\r?\n\s*)\d{1,2}\s+\w+\s+\d{4}(?=\s+heeft)"

Implementation in UiPath:

wka_Lev_Match = Regex.Match(wka_pdf_Output, "(?<=dat\s)[A-Za-zÀ-ÿ\s\.,&\-]+(?=,\s+alle)", RegexOptions.Multiline)

Sample PDF Text (as extracted in UiPath):

U hebt ons gevraagd om een actuele Verklaring betalingsgedrag
 ketenaansprakelijkheid. In deze brief leest u mijn beslissing op uw aanvraag.
 -Beslissing
 Ik verklaar dat LOOHUIS INSTALLATIETECHNIEKEN ALMELO B.V., alle
 loonheffingen over de tijdvakken tot 18 juni 2025 heeft betaald.
 Over deze verklaring
 Deze verklaring gaat alleen over gegevens die tot 18 juni 2025 bij de

It seems that line breaks or spacing in the extracted PDF text might be affecting the match. Has anyone faced similar issues? Is there a better way to handle these unexpected line breaks and multiline content in regex within UiPath?

Greetings,
Stijn

@uiStijn

Can you share the PDF itself here instead of the pasted text?

When we test with the pasted text, the pattern matches properly.

There might be differences when the data is read directly from the PDF.

cheers


@uiStijn,

Before applying your regex, remove or normalize line breaks and extra spaces. For instance, use a simple replacement such as:

wka_pdf_Output = wka_pdf_Output.Replace(System.Environment.NewLine, " ")

Alternatively, use regex itself to collapse multiple whitespace characters into a single space.

This flattens the text and could bring your extracted string closer to what you expect.
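
For the regex variant, a minimal sketch could look like the following. The module name, the Flatten helper, and the sample string are made up for illustration; only Regex.Replace and the "\s+" pattern are standard .NET. It collapses every run of whitespace, including CR, LF, and CRLF, so it does not depend on which line-ending style the PDF extraction produced.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module NormalizeWhitespaceSketch
    ' Hypothetical helper: collapse each run of whitespace (spaces, tabs, CR, LF)
    ' into a single space and trim the ends.
    Function Flatten(text As String) As String
        Return Regex.Replace(text, "\s+", " ").Trim()
    End Function

    Sub Main()
        ' Illustrative input with mixed line endings, as PDF extraction may produce.
        Dim raw As String = "Ik verklaar dat LOOHUIS" & vbCrLf &
                            " INSTALLATIETECHNIEKEN ALMELO B.V., alle" & vbLf &
                            " loonheffingen"

        Console.WriteLine(Flatten(raw))
        ' Prints: Ik verklaar dat LOOHUIS INSTALLATIETECHNIEKEN ALMELO B.V., alle loonheffingen
    End Sub
End Module
```

In UiPath this would probably be a single Assign, for example wka_pdf_Output = System.Text.RegularExpressions.Regex.Replace(wka_pdf_Output, "\s+", " ").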


Thank you @Anil_G & @ashokkarale.

I had to clear all the line breaks from my string first.

wka_pdf_Output.Replace(Environment.NewLine, " ").Replace(vbCrLf, " ").Replace(vbLf, " ").Replace(vbCr, " ")
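
For anyone landing here later, this is a rough sketch of how the two patterns behave once the text is flattened, using two lines from the sample above. Two things in it are assumptions rather than something confirmed in this thread: the date lookbehind is changed from tot\s*\r?\n\s* to tot\s+, because the original requires a line break right after "tot" and flattened text no longer contains one, and the module name is made up. RegexOptions.Multiline is left out because it only changes how ^ and $ behave, and neither pattern uses them.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MatchAfterFlatteningSketch
    Sub Main()
        ' Two lines from the sample PDF text in the question, joined by a line break.
        Dim extracted As String =
            " Ik verklaar dat LOOHUIS INSTALLATIETECHNIEKEN ALMELO B.V., alle" & vbLf &
            " loonheffingen over de tijdvakken tot 18 juni 2025 heeft betaald."

        ' Collapse all whitespace runs (including line breaks) into single spaces.
        Dim flat As String = Regex.Replace(extracted, "\s+", " ").Trim()

        ' Supplier pattern exactly as posted in the question.
        Dim supplier As Match = Regex.Match(flat, "(?<=dat\s)[A-Za-zÀ-ÿ\s\.,&\-]+(?=,\s+alle)")

        ' Date pattern with the lookbehind relaxed to \s+ (assumption, see note above).
        Dim dateMatch As Match = Regex.Match(flat, "(?<=tot\s+)\d{1,2}\s+\w+\s+\d{4}(?=\s+heeft)")

        Console.WriteLine(supplier.Value)   ' LOOHUIS INSTALLATIETECHNIEKEN ALMELO B.V.
        Console.WriteLine(dateMatch.Value)  ' 18 juni 2025
    End Sub
End Module
```

Since \s also matches line breaks, the relaxed date pattern should work whether or not a break follows "tot", so it may be worth checking match.Success on your real extraction output before relying on it.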
