Hello everyone
I'm new to automation.
I have around 400 PDF files and want to extract the data that follows specific text in each one.
My data looks like the following:
CTTTT No: 122e3df243
BBBB No: 4234tgdsthr45
NNNNNN Code: 5735735735735
hhhhhhhh Code: 52346234624
In each PDF file this 4-line block may be repeated five or ten times, with different values each time,
but every block starts with the same labels:
CTTTT No:
BBBB No:
NNNNNN Code:
hhhhhhhh Code:
I want to save this data in an Excel file named after the PDF file.
Have a good night, everyone.
Hi,
Hope the following sample helps you.
mc = System.Text.RegularExpressions.Regex.Matches(strPdf,"CTTTT\s+No:\s*(?<CTTTT>.*)\s+BBBB\s+No:\s*(?<BBBB>.*)\s+NNNNNN\s+Code:\s*(?<NNNNNN>.*)\s+hhhhhhhh\s+Code:\s*(?<hhhhhhhh>.*)")
Then, for each Match m in mc:
{m.Groups("CTTTT").Value,m.Groups("BBBB").Value,m.Groups("NNNNNN").Value,m.Groups("hhhhhhhh").Value}
Note: mc is of type MatchCollection.
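As a rough illustration of how the two expressions above could fit together, here is a minimal sketch in plain VB.NET. It runs the same pattern over a text, loops over the MatchCollection, and collects one row per block in a DataTable. The module name, the hard-coded sample text standing in for strPdf, the column names, and the Trim() calls are assumptions made for this sketch and are not taken from the attached sample.

```vbnet
Imports System
Imports System.Data
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module ExtractSketch
    Sub Main()
        ' Assumption: stand-in for the text already read from one PDF (strPdf in the post above).
        Dim strPdf As String = String.Join(vbLf, {
            "CTTTT No: 122e3df243",
            "BBBB No: 4234tgdsthr45",
            "NNNNNN Code: 5735735735735",
            "hhhhhhhh Code: 52346234624"})

        ' Same pattern as in the post: each .* captures the rest of one line.
        Dim pattern As String = "CTTTT\s+No:\s*(?<CTTTT>.*)\s+BBBB\s+No:\s*(?<BBBB>.*)" &
            "\s+NNNNNN\s+Code:\s*(?<NNNNNN>.*)\s+hhhhhhhh\s+Code:\s*(?<hhhhhhhh>.*)"
        Dim mc As MatchCollection = Regex.Matches(strPdf, pattern)

        ' One column per named group (column names are an assumption).
        Dim dt As New DataTable()
        For Each colName As String In {"CTTTT", "BBBB", "NNNNNN", "hhhhhhhh"}
            dt.Columns.Add(colName, GetType(String))
        Next

        ' One row per matched 4-line block; Trim() drops a trailing carriage return if present.
        For Each m As Match In mc
            dt.Rows.Add(m.Groups("CTTTT").Value.Trim(),
                        m.Groups("BBBB").Value.Trim(),
                        m.Groups("NNNNNN").Value.Trim(),
                        m.Groups("hhhhhhhh").Value.Trim())
        Next

        ' Print the rows so the result can be checked before writing it anywhere.
        For Each row As DataRow In dt.Rows
            Console.WriteLine(String.Join(" | ", row.ItemArray))
        Next
    End Sub
End Module
```

Because each block produces its own Match, a PDF with five or ten repeated blocks would simply yield five or ten rows. The resulting table can then be written to an Excel file named after the PDF by whatever means the workflow already uses.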
Sample20230113-2L.zip (3.3 KB)
Regards,
Could you please advise how I can extend the Regex to also get the values after the following labels?
QTY:
Mfg Date:
Exp Date:
Example:
QTY: 12
Mfg Date: 3/1/2021
Exp Date: 6/1/2027
System.Text.RegularExpressions.Regex.Matches(strPdf,"Cat\s+No\s*:\s*(?<Cat>.*)\s+Batch\s+No\s*:\s*(?<Batch>.*)[\s\S]+?NNNNN\s+Code\s*:\s*(?<NNNNN>.*)\s+HHH\s+Code\s*:\s*(?<HHH>.*)")
I tried adding them in the same manner to the code line, but it doesn't work.
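One possible way to extend the pattern is sketched below in VB.NET. It assumes that the QTY, Mfg Date and Exp Date lines come after the HHH Code line of the same block, which the thread does not show. The group names (Cat, Batch, NNNNN, HHH, QTY, Mfg, Exp), the module name, and the sample values for the first four lines are hypothetical and chosen only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module ExtendedPatternSketch
    Sub Main()
        ' Assumption: the three new lines follow the HHH Code line in the extracted text.
        ' The first four values are made up; the last three are the example from the post.
        Dim strPdf As String = String.Join(vbLf, {
            "Cat No: A1",
            "Batch No: B2",
            "NNNNN Code: 111",
            "HHH Code: 222",
            "QTY: 12",
            "Mfg Date: 3/1/2021",
            "Exp Date: 6/1/2027"})

        ' Each new label gets its own named group, written the same way as the existing ones.
        ' [\s\S]+? lazily skips whatever lies between the HHH Code line and the QTY line.
        Dim pattern As String =
            "Cat\s+No\s*:\s*(?<Cat>.*)\s+Batch\s+No\s*:\s*(?<Batch>.*)" &
            "[\s\S]+?NNNNN\s+Code\s*:\s*(?<NNNNN>.*)\s+HHH\s+Code\s*:\s*(?<HHH>.*)" &
            "[\s\S]+?QTY\s*:\s*(?<QTY>.*)" &
            "\s+Mfg\s+Date\s*:\s*(?<Mfg>.*)" &
            "\s+Exp\s+Date\s*:\s*(?<Exp>.*)"

        For Each m As Match In Regex.Matches(strPdf, pattern)
            Console.WriteLine("QTY=" & m.Groups("QTY").Value.Trim())
            Console.WriteLine("Mfg=" & m.Groups("Mfg").Value.Trim())
            Console.WriteLine("Exp=" & m.Groups("Exp").Value.Trim())
        Next
    End Sub
End Module
```

Two things are worth checking if a pattern like this still returns no matches. First, every (?<name>.*) group needs both its name and the trailing *, and the quotes around the pattern must be straight quotes, not typographic ones. Second, the order of the labels in the pattern has to match their order in the extracted PDF text. If a block can lack one of the new lines, the lazy [\s\S]+? may run on into the next block, so the pattern would need adjusting for that case.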