Extract specified data from a text file

I need to extract specific values from an input text file and write the extracted values to an output text file. Can someone please guide me on how to extract them?

Below is the input text file data. From it we need to extract the values 1440 and 1080, but the amount of whitespace beside the values is not fixed: sometimes there are more spaces and sometimes fewer. Hence I need to extract them dynamically.

Any leads to help with the extraction are much appreciated.


<< date >> 28/11/2023 invoicenumber : AB 430538/00 PAGE : 1

Customer : REACH DATE :28/11/2023 ARRIVAL H :12.45

AB 11002282 - IN STOCK BY :AD DEPARTURE H :18.50

S4*inbound slip sheet - 40ft
!---------------------------------------------------------------------------------------------------------------------------------!
PRODUCT T PALLET #CU DESCRIPTION #PAL BB DATE QUALITY BATCH DELIVERED ORDERED DIFF REMARKS
PAL ISATION BLOCKING CASES PIECES PIECES
!---------------------------------------------------------------------------------------------------------------------------------!
AB S228827 1 30,4 6 6x75CL J /Crk Reserve 12 L3258 1440 8640 15120 -6480
Shiraz 14.5%
AB S228827 1 30,3 6 6x75CL J /Crk Reserve 12 L3258 1080 6480 15120 -8640
Shiraz 14.5%
!---------------------------------------------------------------------------------------------------------------------------------!


please find the exact input

Hi @sreene26

Try the below regex:

(?<=[A-Z]{1,}\d{4,}\s*)\d*(?=\s*\d{3,}\s*\d*)

=> Use this regex in the Find Matching Patterns activity. The input will be the text, saved in a variable.
=> Store the output in a variable. It will be of data type IEnumerable(Match).
=> Run a For Each loop over the stored variable and print the currentItem.

Hope it helps!!

Hi @Parvathy
The regex you provided is not working after I copied the data from the text file.
See the screenshot. I have also attached the input file:
input.TXT (2.4 KB)

Hi @sreene26

Try testing it on the RegExr website, https://regexr.com/.

Regards,

Hi @Parvathy

I tried it. It works only on https://regexr.com/ and not on https://regex101.com/.

However, it is also not working in Studio.
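
A possible explanation for the difference between the two sites (an assumption, not confirmed in this thread): the lookbehind in the pattern contains `{1,}` and `\s*`, so its width is not fixed. .NET, which Studio uses, and JavaScript, which is RegExr's default engine, accept such lookbehinds, while regex101's default PCRE flavor and Python's built-in `re` module reject unbounded quantifiers inside a lookbehind. The Python sketch below only demonstrates that engine restriction; `compiles` is a helper name made up for illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's built-in re engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Lookbehind whose width can vary ({1,}, {4,} and \s* are unbounded)
variable_width = r"(?<=[A-Z]{1,}\d{4,}\s*)\d+"
# Lookbehind with a fixed width of exactly six characters
fixed_width = r"(?<=[A-Z]\d{4}\s)\d+"

print(compiles(variable_width))  # False
print(compiles(fixed_width))     # True
```

If the pattern is tested on regex101, switching the flavor to .NET should give results closer to what Studio does.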

Hi @sreene26

Try this Regex Pattern:

"(?<=[A-Z]{1,}\d{4,})\s*\d*(?=\s*\d{3,}\s*\d*)"

Regards,

Hi @sreene26

Can you try this?

(?<=L3258\s+)\d+
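
For engines that do not support a variable-width lookbehind, the same idea can be written with a capture group. This is a minimal Python sketch, assuming the batch code L3258 always comes directly before the value. The sample rows are copies of the rows in the question with the spacing varied on purpose, and `extract_after_batch` is a hypothetical helper name.

```python
import re

# Rows copied from the question, with deliberately uneven spacing
SAMPLE = (
    "AB S228827 1 30,4 6 6x75CL J /Crk Reserve 12 L3258      1440   8640  15120 -6480\n"
    "Shiraz 14.5%\n"
    "AB S228827 1 30,3 6 6x75CL J /Crk Reserve 12 L3258 1080     6480 15120  -8640\n"
    "Shiraz 14.5%\n"
)

def extract_after_batch(text: str, batch: str = "L3258") -> list[str]:
    """Return the first run of digits that follows each occurrence of the batch code."""
    # \s+ absorbs any amount of whitespace; (\d+) captures the value itself
    pattern = re.compile(re.escape(batch) + r"\s+(\d+)")
    return pattern.findall(text)

print(extract_after_batch(SAMPLE))  # ['1440', '1080']
```

Note that this only works while the batch code stays the same; a different batch code would have to be passed in as `batch`.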

Hi @Parvathy

I have tried with the original file, but it is not extracting the values such as 1440, 1080, 81, 8, 1 (refer to the screenshot below).

I tried in a lot of ways, but it is not extracting them.
I am now attaching the original file:
Original file.TXT (3.0 KB)

HI @sreene26

Try this regex:

(?<=[A-Z]{1,}\d{1,}\s+)\d{1,}(?=\s*\d{2,}\s{1,}\d*)

Hope it helps!!

Hi @Parvathy

It's working, thank you so much. Could you explain it once, though? Here is what I understand, with the parts of the regex numbered as in my screenshot:


1 will be the letters KHT
2 will be the digits 6874
3 will be the space between KHT6874 and 1
4 will be the 1
5 will be the spaces before the 1
6 ?
7 ?
8 ?
Please let me know.

Hi @sreene26

Maybe this explanation will make the regex clear to you. I have broken it into parts to make it easier to understand.

  1. (?<=[A-Z]{1,}\d{1,}\s+): This is a positive lookbehind assertion (?<=). It requires the text immediately before the match to be one or more uppercase letters ([A-Z]{1,}), followed by one or more digits (\d{1,}) and one or more whitespace characters (\s+). The lookbehind only checks this text; it is not included in the match.
  2. \d{1,}: This matches one or more digits. It’s the main part of the pattern that you’re capturing.
  3. (?=\s*\d{2,}\s{1,}\d*): This is a positive lookahead assertion (?=) that matches a position in the string where there is a sequence of optional whitespace characters (\s*), followed by at least two digits (\d{2,}), then one or more whitespace characters (\s{1,}), and finally, zero or more digits (\d*). This means it looks for a specific pattern after the actual match.
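
The three parts above can be sketched in Python as follows. Python's built-in `re` module does not accept a variable-width lookbehind, so in this adaptation part 1 is consumed as an ordinary prefix and part 2 becomes a capture group, while the lookahead stays unchanged. It is therefore not an exact copy of the .NET pattern. The sample rows are copied from the question with the spacing varied on purpose, and `extract_values` is a hypothetical name.

```python
import re

PATTERN = re.compile(
    r"""
    [A-Z]+ \d+ \s+             # part 1: uppercase letters, digits, whitespace (the lookbehind in .NET)
    (\d+)                      # part 2: the value to extract, captured as group 1
    (?= \s* \d{2,} \s+ \d* )   # part 3: lookahead, checked but not consumed
    """,
    re.VERBOSE,
)

# Rows copied from the question, with deliberately uneven spacing
SAMPLE = (
    "AB S228827  1 30,4   6 6x75CL J /Crk Reserve   12 L3258      1440    8640   15120  -6480\n"
    "Shiraz 14.5%\n"
    "AB S228827 1 30,3 6 6x75CL J /Crk Reserve 12 L3258 1080 6480 15120 -8640\n"
    "Shiraz 14.5%\n"
)

def extract_values(text: str) -> list[str]:
    """Return group 1 of every match, i.e. the digits selected by part 2."""
    return PATTERN.findall(text)

print(extract_values(SAMPLE))  # ['1440', '1080']
```

The `1` after `S228827` is not returned because the lookahead fails there: the next token is `30,4`, and the comma after `30` is not whitespace.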

If this solves your query, please mark it as the solution to close the loop.

Happy Automation
Regards,
