ayeo22
(ayeo22)
November 16, 2022, 2:24pm
1
I need help with this.
I already use Read PDF Text and write its output to a text file, shown below.
I need to extract:
the name (in this case TAN AH KOW), which differs between letters and varies in length
the date (in this case 16 Nov 2022), which also differs between letters
How can I do that?
Here is the content of the text file:
Ref NO: 1234 5678
Oct 18, 2022
TAN AH KOW
Senior Supervisor
ABC COMPANY
Dear member,
ACCOUNT
We checked your account, and $100 had already been credited to your account on 16 Nov 2022. You may refer to the attached bank statement for more information.
Gokul001
(Gokul Balaji)
November 16, 2022, 2:33pm
2
Hi @ayeo22
Is Senior Supervisor static in the text?
If yes, you can try this regex expression:
System.Text.RegularExpressions.Regex.Match(YourString,"\S.*(?=\nSenior\sSupervisor)").ToString
Output -> TAN AH KOW
Regards
Gokul
ayeo22
(ayeo22)
November 16, 2022, 2:35pm
3
Thanks for the prompt reply.
Unfortunately, Senior Supervisor is not static either; it can be Manager, Senior Manager, etc.
ayeo22
(ayeo22)
November 16, 2022, 2:36pm
4
For the date, I need 16 Nov 2022, which is always in the same location.
Gokul001
(Gokul Balaji)
November 16, 2022, 2:38pm
5
Hi @ayeo22
For the date, you can try this expression:
System.Text.RegularExpressions.Regex.Match(YourString,"\b\d{2}\s\S{3}\s\d{4}\b").ToString
Output -> 16 Nov 2022
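For anyone who wants to check the date pattern outside UiPath, here is a minimal Python sketch of the same expression. The helper name `extract_credit_date` is hypothetical, and the sample text is copied from the first post. The pattern itself is unchanged, so it should behave the same way in .NET for this input.

```python
import re

# Sample letter text copied from the first post.
LETTER = (
    "Ref NO: 1234 5678\n"
    "Oct 18, 2022\n"
    "TAN AH KOW\n"
    "Senior Supervisor\n"
    "ABC COMPANY\n"
    "Dear member,\n"
    "ACCOUNT\n"
    "We checked your account, and $100 had already been credited to your "
    "account on 16 Nov 2022. You may refer to the attached bank statement "
    "for more information.\n"
)

# Two digits, whitespace, three non-space characters, whitespace, four digits.
DATE_PATTERN = re.compile(r"\b\d{2}\s\S{3}\s\d{4}\b")


def extract_credit_date(text):
    """Return the first 'DD Mon YYYY' date in text, or None (hypothetical helper)."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_credit_date(LETTER))  # 16 Nov 2022
```

The header date (Oct 18, 2022) is skipped because it is written month first with a comma. Note that the pattern needs a two-digit day, so a date like 5 Nov 2022 would not match unless `\d{2}` is relaxed to `\d{1,2}`.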
Gokul001
(Gokul Balaji)
November 16, 2022, 2:40pm
6
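Hi @ayeo22
You can try this expression for the name. It anchors on the date line above the name (e.g. Oct 18, 2022) and the blank line after it, so it does not depend on the job title:
System.Text.RegularExpressions.Regex.Match(YourString,"(?<=\b\S{3}\s\d{2},\s\d{4}\b\n\n)\S.*").ToString
Output -> TAN AH KOW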
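A quick way to see what this lookbehind depends on is to run the same pattern in Python. This is only a sketch: the helper name `extract_name` is hypothetical, and the sample layout (one blank line between the header date and the name, `\n` line endings) is an assumption inferred from the `\n\n` in the expression, not something confirmed in the thread.

```python
import re

# Lookbehind: a header date such as "Oct 18, 2022" followed by exactly one
# blank line. The match itself is the rest of the next non-blank line.
NAME_PATTERN = re.compile(r"(?<=\b\S{3}\s\d{2},\s\d{4}\b\n\n)\S.*")


def extract_name(text):
    """Return the line that follows the header date, or None (hypothetical helper)."""
    match = NAME_PATTERN.search(text)
    return match.group(0).strip() if match else None


# Assumed layout: blank lines between blocks, "\n" line endings.
letter = (
    "Ref NO: 1234 5678\n"
    "\n"
    "Oct 18, 2022\n"
    "\n"
    "TAN AH KOW\n"
    "Senior Supervisor\n"
    "\n"
    "ABC COMPANY\n"
)

print(extract_name(letter))  # TAN AH KOW
```

Because the job title is never part of the pattern, it can be anything. The trade-off is that the layout must match exactly: if the file has no blank line after the date, or uses `\r\n` line endings, the lookbehind fails and nothing is returned.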
Hello @ayeo22
Try this.
To get the date:
System.Text.RegularExpressions.Regex.Match(YourString,"(?<=account\son\W)[\dA-Za-z\s]+").ToString.Trim
To get the name:
System.Text.RegularExpressions.Regex.Match(YourString,".*(?=\WSenior\WSupervisor)").ToString.Trim
ayeo22
(ayeo22)
November 16, 2022, 2:44pm
8
Thanks, I will come back tomorrow after trying this.
Need to settle something urgent now.
Thanks again
Try this:
System.Text.RegularExpressions.Regex.Match(YourString,".*(?=\WSenior\WSupervisor)|.*(?=\WSenior\WManager)|.*(?=\nManager)").ToString.Trim
ayeo22
(ayeo22)
November 16, 2022, 3:22pm
10
There could be many other positions besides Senior Supervisor, Manager and Senior Manager.
I am hoping for a solution that works no matter what the position is.
Gokul001
(Gokul Balaji)
November 16, 2022, 3:29pm
11
Gokul001:
You can try this expression for the name:
System.Text.RegularExpressions.Regex.Match(YourString,"(?<=\b\S{3}\s\d{2},\s\d{4}\b\n\n)\S.*").ToString
Output -> TAN AH KOW
Have you tried this expression, @ayeo22?
ayeo22
(ayeo22)
November 16, 2022, 3:32pm
12
Sorry, I will have to try it tomorrow and let you know.
I forgot to mention that the name can also be of different lengths for different members, e.g. Mohammed Subramaniam bin Ashok Kumar.
Gokul001
(Gokul Balaji)
November 17, 2022, 4:34am
13
Names of that length will also be extracted by this expression.
Kindly check it and let me know the result, @ayeo22.
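If the exact line breaks in the extracted text are uncertain, a capture group can be more forgiving than the fixed lookbehind. The Python sketch below is a variant, not the expression posted above: it assumes only that the name is the first non-blank line after a header date of the form Oct 18, 2022. The names `NAME_AFTER_DATE` and `extract_name_after_date` are hypothetical.

```python
import re

NAME_AFTER_DATE = re.compile(
    r"^[A-Za-z]{3}\s\d{1,2},\s\d{4}[ \t]*\r?\n"  # header date line, e.g. "Oct 18, 2022"
    r"(?:[ \t]*\r?\n)*"                          # zero or more blank lines
    r"[ \t]*(\S[^\r\n]*)",                       # first non-blank line: the name
    re.MULTILINE,
)


def extract_name_after_date(text):
    """Return the first non-blank line after the header date, or None."""
    match = NAME_AFTER_DATE.search(text)
    return match.group(1).strip() if match else None


# Long name, different job title, Windows line endings, one blank line.
long_name_letter = (
    "Ref NO: 1234 5678\r\n"
    "Oct 18, 2022\r\n"
    "\r\n"
    "Mohammed Subramaniam bin Ashok Kumar\r\n"
    "Manager\r\n"
)

# Short name, no blank line, Unix line endings.
short_name_letter = "Oct 18, 2022\nTAN AH KOW\nSenior Supervisor\n"

print(extract_name_after_date(long_name_letter))   # Mohammed Subramaniam bin Ashok Kumar
print(extract_name_after_date(short_name_letter))  # TAN AH KOW
```

The same pattern should carry over to .NET with `RegexOptions.Multiline` and `Groups(1).Value` in place of `.ToString`, but that has not been tested in this thread.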