Advanced Regex over multiple lines

Hi all, I’m hoping some clever chap or chapess can solve what is probably a simple task but is driving me mad. I am reading PDF invoices and trying to extract the customer name to use as the filename. I have an expression that works in RegEx Builder using test text, but when the file is run it returns null.

Sample text:
"@"Invoice

Date Invoice #

01/01/2022 23

Bill To Terms 21 days
ACME Company Ltd
Unit 2016
The Residences

Description Amount"

In the example above I want to return “ACME Company Ltd” and in the RegEx Builder I can achieve that with the following expression, which (I think) asks it to match anything between " days" and "Unit " ignoring new line returns in between.

(?<= days\n)[\S\s]*(?=\nUnit )

As I say, it works in the builder but not when I run it - any help is very much appreciated.
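
One possible cause, which nothing in this thread confirms, is that the text read from the PDF uses Windows line endings (`\r\n`) while the test text in the builder uses plain `\n`. The sketch below is in Python rather than in the workflow itself, on the assumption that its `re` module treats these constructs the same way as the .NET engine. It runs the expression from the question against both variants of a shortened copy of the sample text.

```python
import re

# Pattern from the question, unchanged.
PATTERN = r"(?<= days\n)[\S\s]*(?=\nUnit )"

# Shortened sample text with LF-only line endings, as typed into a test box.
lf_text = "Bill To Terms 21 days\nACME Company Ltd\nUnit 2016\nThe Residences\n"

# Same text with CRLF line endings. This is an ASSUMPTION about what the
# PDF reader might return at runtime; the thread does not confirm it.
crlf_text = lf_text.replace("\n", "\r\n")

lf_match = re.search(PATTERN, lf_text)
crlf_match = re.search(PATTERN, crlf_text)

# With LF the lookbehind " days\n" is satisfied and the name is captured.
print(lf_match.group(0) if lf_match else None)

# With CRLF the character before "\n" is "\r", not "s", so the lookbehind
# can never be satisfied and there is no match.
print(crlf_match)
```

If that is what is happening, it would explain a pattern that works on pasted test text but returns nothing on the real input.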

Personally, I’ve had problems with newlines inside lookarounds, so I try to avoid them.
Could you try the following regex?

(.*)\n(?=Unit)
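
A quick way to see what this expression captures is to run it outside the workflow. This is a minimal sketch in Python, assuming its `re` module behaves like the .NET engine for this pattern. The second input, with a "Trading Division" line, is a made-up example of a customer name that spans two lines.

```python
import re

PATTERN = r"(.*)\n(?=Unit)"

# Shortened sample text from the question (LF line endings assumed).
single = "Bill To Terms 21 days\nACME Company Ltd\nUnit 2016\nThe Residences\n"

# Hypothetical variant where the customer name runs over two lines.
multi = "Bill To Terms 21 days\nACME Company Ltd\nTrading Division\nUnit 2016\n"

def line_before_unit(text):
    # Group 1 is the single line that sits directly above the "Unit" line.
    match = re.search(PATTERN, text)
    return match.group(1) if match else None

single_result = line_before_unit(single)
multi_result = line_before_unit(multi)

print(single_result)  # ACME Company Ltd
print(multi_result)   # Trading Division
```

Because `.` does not cross a line break, only the line immediately above "Unit" is returned, so a two-line customer name loses its first line.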

As an alternative, here is a non-greedy pattern that also spans multiple lines:

(?<= days)(.|\n)*?(?=Unit)
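
The same kind of check can be done for this pattern. Again this is only a sketch in Python, assuming equivalent behaviour in the .NET engine. The helper name `extract_customer` and the two-line "Trading Division" input are made up for illustration. The match includes the line breaks on either side of the name, so the sketch trims them.

```python
import re

PATTERN = re.compile(r"(?<= days)(.|\n)*?(?=Unit)")

def extract_customer(text):
    # Hypothetical helper: lazily take everything between " days" and the
    # first "Unit", then trim the surrounding line breaks.
    match = PATTERN.search(text)
    return match.group(0).strip() if match else None

single = "Bill To Terms 21 days\nACME Company Ltd\nUnit 2016\nThe Residences\n"
multi = "Bill To Terms 21 days\nACME Company Ltd\nTrading Division\nUnit 2016\n"
crlf = single.replace("\n", "\r\n")  # assumed Windows line endings

print(extract_customer(single))      # ACME Company Ltd
print(repr(extract_customer(multi))) # 'ACME Company Ltd\nTrading Division'
print(extract_customer(crlf))        # ACME Company Ltd
```

Since neither lookaround contains a newline, the pattern does not depend on which line-ending style the input uses. A multi-line name still contains a line break after trimming, which would need replacing before it is used as a filename.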

390926.xaml (4.9 KB)

Thank you so much for the responses!

_abso - your solution worked perfectly for the example given, but I do have cases where the customer runs over multiple lines.

Peter - yours does the trick over multiple lines - good man!

I owe both of you a beer…
