Hi team,
Wishing you a very Happy New Year!
In the attached image, taken from a PDF, I am trying to extract the value of "Bill To Cost Centre / Project Code". I am having trouble with regex because, when I read the PDF as text, the lines come out like this:
" Bill To Cost Centre /
Invoices will be raised
1512451256
Project Code
manually and submitted to "
I only need to extract the number 1512451256 that belongs to the label "Bill To Cost Centre / Project Code". How can I ignore the string "Invoices will be raised"?
Thank you!
Regards
Max
Fine. If the input is in a string variable named strinput, the output can go into a string variable named stroutput, set with Assign activities:
strinput = String.Join("",strinput.Split(Environment.NewLine.ToArray()))
stroutput = System.Text.RegularExpressions.Regex.Match(strinput,"(?<=Bill To Cost Centre\s\W).*(?=Project Code)").ToString
Then add a final Assign activity like this:
stroutput = System.Text.RegularExpressions.Regex.Match(stroutput.ToString,"(\d)+").ToString
Cheers @mc00476004
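For readers who want to try the idea outside UiPath, here is a minimal sketch of the same three steps in Python rather than VB.NET. The regex patterns are copied from the expressions above; the sample text is the one from the question, and the variable names (raw, flat, between, result) are only illustrative.

```python
import re

# Sample text as it comes out of the PDF reader (taken from the question).
raw = (
    " Bill To Cost Centre /\n"
    "Invoices will be raised\n"
    "1512451256\n"
    "Project Code\n"
    "manually and submitted to "
)

# Step 1: drop the line breaks so both label halves sit in one string.
flat = "".join(raw.splitlines())

# Step 2: keep everything between the two halves of the label.
between = re.search(r"(?<=Bill To Cost Centre\s\W).*(?=Project Code)", flat)

# Step 3: pull the first run of digits out of that slice.
digits = re.search(r"\d+", between.group()) if between else None
result = digits.group() if digits else ""

print(result)
```

Step 2 still contains the words "Invoices will be raised"; it is step 3 that discards them, which is why this version only works while the value is purely numeric.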
Palaniyappan:
(\d)+
Thank you very much, @Palaniyappan. I understand this, but what if the value is not made up of digits only?
Usually the Bill To Cost Centre will be digits only, right?
Or is there a chance of getting an alphanumeric value?
@mc00476004
Yeah, @Palaniyappan, in one of the PDFs it's alphanumeric, and it can start with either a digit or a letter.
Manish540
(Manish Shettigar)
January 1, 2020, 6:58am
7
Will this number "1512451256" always be of constant length?
If it is of constant length, then you can use "\d{10}"
Hi Manish, I will send you the PDF separately. No, that value is not of constant length, and it is alphanumeric.
Thank you for your help!
Awesome. In that case we can use this common expression in the last Assign activity, which would get both numeric and alphanumeric values:
stroutput = System.Text.RegularExpressions.Regex.Match(stroutput.ToString,"\d+|[0-9A-Z\W]+").ToString
Cheers @mc00476004
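One thing to check before relying on this pattern: \W matches any non-word character, including spaces and punctuation, and A-Z matches the capital "I" of "Invoices". The short Python sketch below shows what the pattern picks up on a slice like the one left by the previous step. The value AB12CD34 is a made-up alphanumeric code used only for illustration.

```python
import re

pattern = r"\d+|[0-9A-Z\W]+"

# Hypothetical slice between the two label halves, with an invented code.
sample = "Invoices will be raised AB12CD34"

# Regex.Match in .NET, like re.search here, returns only the first match.
first = re.search(pattern, sample).group()

# Every match in order, to show what else the character class accepts.
all_matches = re.findall(pattern, sample)

print(repr(first))
print(all_matches)
```

The first match is the single letter "I", and the code only appears in the last match with a leading space attached, so the first match alone is not the wanted value for text of this shape.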
Manish540
(Manish Shettigar)
January 1, 2020, 7:08am
10
You can use the code below:
System.Text.RegularExpressions.Regex.Match(YourString,"\d{10}").ToString
Check with this.
Palaniyappan:
\d+|[0-9A-Z\W]+
Thank you, @Palaniyappan. This pattern \d+|[0-9A-Z\W]+ is matching some other characters as well.
I have modified it. Can you please check whether the below is okay and whether there will be any errors?
Kindly include this in your expression
[0-9]+|[0-9A-Z\W]+\d+|[0-9A-Z\W]+
Cheers @mc00476004
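An alternative that avoids the character-class guessing is to keep the line breaks and look at the lines that fall between the two halves of the label. The Python sketch below is only one possible approach and rests on two assumptions that are not confirmed in this thread: the code sits on a line of its own, and it is a single token of letters and digits containing at least one digit. The function name extract_cost_centre and the value AB12CD34 are hypothetical.

```python
import re

def extract_cost_centre(text: str) -> str:
    """Return the code found between the two halves of the split label."""
    inside = False
    for line in text.splitlines():
        line = line.strip()
        if "Bill To Cost Centre" in line:
            inside = True          # first half of the label seen
            continue
        if inside and "Project Code" in line:
            break                  # second half of the label: stop looking
        # Assumption: the code is one unbroken alphanumeric token
        # with at least one digit, so multi-word sentences are skipped.
        if inside and re.fullmatch(r"(?=.*\d)[A-Za-z0-9]+", line):
            return line
    return ""

numeric = " Bill To Cost Centre /\nInvoices will be raised\n1512451256\nProject Code\nmanually and submitted to "
alnum = " Bill To Cost Centre /\nInvoices will be raised\nAB12CD34\nProject Code\nmanually and submitted to "

print(extract_cost_centre(numeric))
print(extract_cost_centre(alnum))
```

Because "Invoices will be raised" contains spaces, it never passes the single-token check, so it is ignored without having to name it in the pattern. A PDF where the code shares a line with other text would need a different rule.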
Kindly let me know if you have any queries or need clarification.
Cheers @mc00476004