Regex Issue with dynamic Text

Scenario 1
Sudberry, MD M1K 5E3 Terms: Net 30 Days
PO # 24043422 Req #/ Ref # 09/12/2020
VISUAL INSPECTION @ 2119 COLDLIN CRES. ETOBMCUAY ** BRIDGE CONTACT
STEVESISNET** .QIC’W "3‘99”“

Scenario 2
peterborough, ON M1K 5E3 Terms: Net 30 Days
P@ # 24043422 Req #/ Ref # 09/12/2020
VISUAL INSPECTION @ 2119 COLDLIN CRES. ETOBMCUAY ** BRIDGE CONTACT
STEVESISNET** .QIC’W "3‘99”“

Output : 24043422
Net 30 Days is constant

Hello @vboddu,

Will the number of digits in the output always remain the same?
@vboddu

Maybe something like this could work:

(?<=#\s?)(\d+)(?=\s?Req)

And like this if you are sure there always will be spaces before and after the number:
(?<=# )(\d+)(?= Req)
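For anyone testing these outside UiPath, here is a minimal Python sketch of the same idea. It is an illustration only, not part of the original answer: Python's `re` module rejects variable-width lookbehind such as `(?<=#\s?)`, so the sketch consumes the `#` and `Req` and reads the number from a capture group instead. The helper name `extract_po` is hypothetical.

```python
import re

# Sample lines copied from the two scenarios in the question.
scenario1 = (
    "Sudberry, MD M1K 5E3 Terms: Net 30 Days\n"
    "PO # 24043422 Req #/ Ref # 09/12/2020\n"
)
scenario2 = (
    "peterborough, ON M1K 5E3 Terms: Net 30 Days\n"
    "P@ # 24043422 Req #/ Ref # 09/12/2020\n"
)

# Capture-group form of the lookaround idea: digits between '#' and 'Req',
# with an optional whitespace character on either side.
PO_BEFORE_REQ = re.compile(r"#\s?(\d+)\s?Req")


def extract_po(text):
    """Return the digits found between '#' and 'Req', or None if absent."""
    match = PO_BEFORE_REQ.search(text)
    return match.group(1) if match else None


print(extract_po(scenario1))  # 24043422
print(extract_po(scenario2))  # 24043422
```

Both scenarios still contain the `#` before the number, so the pattern works even when OCR misreads "PO" as "P@".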

Yes Shiva, it will always be 8 digits.

If the number of digits is constant, you can try this:

Pattern : \d{8}

Cheers
@vboddu

You can try the method mentioned above.

Let me know whether it works for you.

@vboddu

There can be other 8-digit numbers in the text:

2818052284 .
(a) INVOICE#: 15935780
’ INVOICE DATE: 31-Jul-20
Sudberry, MD M1K 5E3 Terms: Net 30 Days
PO # 24043422 Req #/ Ref # 09/12/2020
VISUAL INSPECTION @ 2119 COLDLIN CRES. ETOBMCUAY ** BRIDGE CONTACT
STEVESISNET** .QIC’W "3‘99”“
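To make the problem concrete, here is a small Python sketch (an illustration, not part of the original post) that runs the plain `\d{8}` pattern over the first lines of this sample. The leading OCR noise character on the invoice date line is left out of the sample string.

```python
import re

# First lines of the sample above.
text = (
    "2818052284 .\n"
    "(a) INVOICE#: 15935780\n"
    "INVOICE DATE: 31-Jul-20\n"
    "Sudberry, MD M1K 5E3 Terms: Net 30 Days\n"
    "PO # 24043422 Req #/ Ref # 09/12/2020\n"
)

# A bare \d{8} also matches the first 8 digits of the longer number.
all_hits = re.findall(r"\d{8}", text)
print(all_hits)  # ['28180522', '15935780', '24043422']

# Word boundaries drop the partial hit but still leave two candidates.
exact_hits = re.findall(r"\b\d{8}\b", text)
print(exact_hits)  # ['15935780', '24043422']
```

Even with word boundaries the invoice number is still picked up, so the pattern needs an anchor such as `#`, `Req`, or "Net 30 Days".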

In that case, can you try this?

Pattern : (?<=#).[0-9]{8}
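A quick Python check of this pattern (a sketch for illustration; the lookbehind here is fixed-width, so it behaves the same in Python's `re` as in .NET). Note that the `.` consumes the character after `#`, so the match carries a leading space that has to be trimmed.

```python
import re

PATTERN = re.compile(r"(?<=#).[0-9]{8}")

po_line = "PO # 24043422 Req #/ Ref # 09/12/2020"
invoice_line = "(a) INVOICE#: 15935780"

match = PATTERN.search(po_line)
raw = match.group(0)   # includes the single character matched by '.'
po_number = raw.strip()

print(repr(raw))       # ' 24043422'
print(po_number)       # 24043422

# On the invoice line '#' is followed by ': ', so '.' plus 8 digits cannot match.
print(PATTERN.search(invoice_line))  # None
```

This skips the invoice number in the sample, but it still depends on OCR reading the `#` correctly.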

cheers
@vboddu

Shiva, can we base it on "Net 30 Days" instead? OCR sometimes fails to read the # character, which would break that pattern. The constant text will be Net 30 Days.

Hi @vboddu

Yes, it is possible to base it on “Net 30 Days”.

Just confirm one thing: will the 8-digit number you want to extract always be on the line right after “Net 30 Days”?

And will that next line contain only one 8-digit number?

If the answer to both questions is yes, here is the workflow:
MainPratik.xaml (7.9 KB)
text.txt (184 Bytes)

If the data format is always the same, the following string manipulation retrieves the line after “Net 30 Days”:

readTextFile.Substring(readTextFile.IndexOf("Net 30 Days")+"Net 30 Days".Length).Split(Convert.ToChar(vblf))(1)

After that, a regex is applied to that line to extract the 8-digit number. (The screenshots showing the regex and its output are not available.)
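For readers who cannot open the attached workflow, the two steps can be sketched roughly in Python as follows. This is an approximation, not the workflow itself: the regex used in the workflow is not shown in the text of this thread, so `\d{8}` is an assumption, and the function name `po_from_line_after` is hypothetical.

```python
import re


def po_from_line_after(text, anchor="Net 30 Days"):
    """Return the first 8-digit run on the line after `anchor`, or None."""
    idx = text.find(anchor)
    if idx == -1:
        return None
    # Same idea as Substring(IndexOf(...) + Length).Split(vbLf)(1):
    # element 0 is the rest of the anchor's own line, element 1 is the next line.
    parts = text[idx + len(anchor):].split("\n")
    if len(parts) < 2:
        return None
    next_line = parts[1]
    # Assumed pattern: the thread does not show the regex from the workflow.
    match = re.search(r"\d{8}", next_line)
    return match.group(0) if match else None


sample = (
    "2818052284 .\n"
    "(a) INVOICE#: 15935780\n"
    "Sudberry, MD M1K 5E3 Terms: Net 30 Days\n"
    "PO # 24043422 Req #/ Ref # 09/12/2020\n"
)
print(po_from_line_after(sample))  # 24043422
```

Because only the line after the anchor is searched, 8-digit numbers elsewhere in the document do not interfere.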

Mark as solution and like it :slight_smile:

Happy Automation :raised_hands:

Best Regards
Er Pratik Wavhal :robot::man_technologist:t4: :computer:

Sure, you can try something like this (the screenshots for Scenarios 1 to 3 are not available):

Pattern : (?<=Net 30 Days)(.|\n)*(?=Req)

From this match you can trim the result and extract the 8 digits.
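A rough Python sketch of this two-step approach (for illustration only; the follow-up `\d{8}` search is an assumed way to "get the 8 digits", since the post does not say how):

```python
import re

text = (
    "(a) INVOICE#: 15935780\n"
    "Sudberry, MD M1K 5E3 Terms: Net 30 Days\n"
    "PO # 24043422 Req #/ Ref # 09/12/2020\n"
    "VISUAL INSPECTION @ 2119 COLDLIN CRES. ETOBMCUAY ** BRIDGE CONTACT\n"
)

# Step 1: everything between "Net 30 Days" and "Req" (newlines included).
between = re.search(r"(?<=Net 30 Days)(.|\n)*(?=Req)", text)
chunk = between.group(0)
print(repr(chunk))  # '\nPO # 24043422 '

# Step 2 (assumption): pull the 8-digit run out of that chunk.
digits = re.search(r"\d{8}", chunk)
print(digits.group(0))  # 24043422
```

One caveat: `(.|\n)*` is greedy, so if "Req" appeared again further down the document the match would stretch to the last occurrence. Using `(.|\n)*?` would stop at the first one.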

Cheers

It will be 8 digits for sure, but there can also be other 8-digit numbers elsewhere in the text.

Hi @vboddu

Yes, that is fine. I was only asking about this one line:

PO # 24043422 Req #/ Ref # 09/12/2020

If an 8-digit number also appears on any other line, it will not affect the workflow I provided.

You can try it if you want. It will always return the 8-digit number from the line after “Net 30 Days”.


Hi Pratik,

It fails for the text above. I need an expression that returns the first 8-digit number on the line immediately after Net 30 Days.

Hi @vboddu

Instead of pasting the text here, can you share it as a text file (e.g. saved from Notepad)?


Hi @vboddu

It is working perfectly for me.

The workflow is attached below:
MainPratik.xaml (7.9 KB)
text.txt (251 Bytes)

I haven’t changed anything. I just pasted the text you gave me into the text file and ran the workflow.

It is the same workflow I attached earlier; you may have a look.


Hi @vboddu

https://drive.google.com/file/d/1HPk4NNh0DmEkcauAIsvEbAaeFC-ccLoV/view?usp=sharing

I have attached a screen recording of the run; you may check it. It works fine for me on any set of text, including the text you mentioned above.


Hey @vboddu

It seems you are using OCR to read this document, hence the inconsistencies.

Try this pattern:
(?<=Net \d\d Days\s*).{3,6}(\d{8})

Or Try this:
\d+(?= Req)
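Both patterns can be tried in Python with one adjustment, sketched below as an illustration rather than the exact UiPath setup. The lookbehind `(?<=Net \d\d Days\s*)` is variable-width, which .NET accepts but Python's `re` does not, so the first pattern is rewritten to consume the prefix and read the number from group 1. The second pattern is used unchanged.

```python
import re

# Scenario 2 from the question, where OCR read "PO" as "P@".
scenario2 = (
    "peterborough, ON M1K 5E3 Terms: Net 30 Days\n"
    "P@ # 24043422 Req #/ Ref # 09/12/2020\n"
)

# Rewritten first pattern: prefix is consumed instead of looked behind,
# .{3,6} skips the short "PO # " style label, group 1 holds the number.
AFTER_TERMS = re.compile(r"Net \d\d Days\s*.{3,6}(\d{8})")

# Second pattern, unchanged: digits directly followed by " Req".
BEFORE_REQ = re.compile(r"\d+(?= Req)")

first = AFTER_TERMS.search(scenario2)
second = BEFORE_REQ.search(scenario2)

print(first.group(1))   # 24043422
print(second.group(0))  # 24043422
```

The first pattern does not depend on `#` or `Req` being read correctly, only on "Net", "Days", and a 3 to 6 character label before the number. The second depends only on " Req" following the number.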

This worked, Pratik. Will the same work with just "Days" instead of "Net 30 Days"?