"Invoice No. 221944 \r\n \r\nAleea Rozelor 84\r\nIasi, Romania\r\n \r\nDate: 2017-08-23\r\n \r\nVendor: Star Software\r\n \r\n \r\nClient Name \r\nACME Systems Inc. \r\nSomewhere Road 59, \r\nBucharest, Romania \r\n \r\nNotes\r\n \r\n \r\nInvoices must be paid within 20 days starting with the issue date.\r\n \r\n \r\nItem Description Quantity Price Per Total \r\nProfessional Services 1 37412.4 EUR 37412.4 EUR\r\n \r\n \r\n \r\n \r\n \r\n \r\n \r\nSubtotal: 31177 EUR\r\n \r\nTax: 6235.4 EUR\r\n \r\n \r\nTotal: 37412.4 EUR\r\n \r\n \r\n "
Here is my Read PDF activity output. I want to extract the Invoice No., Date, and Total. How can I get them?
bcorrea
(Bruno Correa)
February 3, 2020, 2:41pm
2
For the invoice no., a simple regular expression like \d+ will work, as long as you take only the first match.
For the date: (\d{4}-\d{2}-\d{2})
For the total: (?<=Total: )(\d+\.\d{1})(?= )
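As a rough illustration of how these three patterns behave on the posted text, here is a minimal Python sketch. Python is used only to keep the example runnable; the patterns themselves use syntax that .NET's Regex also accepts. The variable names and the abbreviated sample string are assumptions made for this example.

```python
import re

# Abbreviated copy of the Read PDF output from the question.
text = (
    "Invoice No. 221944 \r\n \r\nAleea Rozelor 84\r\nIasi, Romania\r\n \r\n"
    "Date: 2017-08-23\r\n \r\nVendor: Star Software\r\n \r\n"
    "Subtotal: 31177 EUR\r\n \r\nTax: 6235.4 EUR\r\n \r\n \r\n"
    "Total: 37412.4 EUR\r\n \r\n"
)

# First run of digits in the text: the invoice number in this sample.
invoice_no = re.search(r"\d+", text).group()

# ISO-style date (yyyy-MM-dd).
invoice_date = re.search(r"\d{4}-\d{2}-\d{2}", text).group()

# Number with one decimal digit, preceded by "Total: " and followed by a space.
# The match is case-sensitive, so "Subtotal: " does not satisfy the lookbehind.
total = re.search(r"(?<=Total: )\d+\.\d{1}(?= )", text).group()

print(invoice_no, invoice_date, total)  # 221944 2017-08-23 37412.4
```

Note that the total pattern expects exactly one decimal digit before the space, which fits this invoice but may not fit others.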
chenderson
(Cary Henderson)
February 3, 2020, 2:50pm
3
To expand on bcorrea’s answer, you can use the “Matches” activity in the Programming → String section of the activity panel.
See attached screenshots for details.
bcorrea
(Bruno Correa)
February 3, 2020, 3:25pm
4
But this would return a lot of matches and he would be kind of lost… So he would be better off with Assign activities like this:
Dim m As Match = Regex.Match(value, "\d+", RegexOptions.IgnoreCase)
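The difference between collecting every match and taking only the first one can be sketched as follows. This is a Python illustration under the assumption that `re.findall` plays the role of the Matches activity and `re.search` the role of `Regex.Match`; the sample string is a shortened copy of the posted text.

```python
import re

# Shortened copy of the beginning of the Read PDF output.
text = (
    "Invoice No. 221944 \r\n \r\nAleea Rozelor 84\r\n"
    "Iasi, Romania\r\n \r\nDate: 2017-08-23\r\n"
)

# Every run of digits: the invoice number is mixed in with
# the street number and the pieces of the date.
all_numbers = re.findall(r"\d+", text)

# Only the first run of digits, which is the invoice number here.
first_number = re.search(r"\d+", text).group()

print(all_numbers)   # ['221944', '84', '2017', '08', '23']
print(first_number)  # 221944
```

Taking the first match only works because the invoice number happens to be the first number in the document.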
chenderson
(Cary Henderson)
February 3, 2020, 3:34pm
5
Personally I would use the following expression to get the value between “Invoice No.” and the space following the invoice number:
(?<=Invoice No\. ).*?(?=\s)
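The same "value between a label and the next whitespace" idea can be generalized to the other fields. The sketch below is in Python for runnability; `extract_after_label` is a hypothetical helper written for this example, not a UiPath or .NET function, and the sample string is abbreviated from the posted text.

```python
import re

# Abbreviated copy of the Read PDF output from the question.
text = (
    "Invoice No. 221944 \r\n \r\nDate: 2017-08-23\r\n \r\n"
    "Subtotal: 31177 EUR\r\n \r\nTax: 6235.4 EUR\r\n \r\n"
    "Total: 37412.4 EUR\r\n \r\n"
)

def extract_after_label(source, label):
    """Return the text between `label` and the next whitespace, or None."""
    # re.escape keeps characters such as "." in the label literal.
    pattern = r"(?<=" + re.escape(label) + r").*?(?=\s)"
    match = re.search(pattern, source)
    return match.group() if match else None

invoice_no = extract_after_label(text, "Invoice No. ")
invoice_date = extract_after_label(text, "Date: ")
total = extract_after_label(text, "Total: ")

print(invoice_no, invoice_date, total)  # 221944 2017-08-23 37412.4
```

Because the lookbehind is case-sensitive, "Total: " does not match inside "Subtotal: ".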
bcorrea
(Bruno Correa)
February 3, 2020, 3:35pm
6
You could, but you don't need to, because there is no other numeric value there without decimal places…
chenderson
(Cary Henderson)
February 3, 2020, 3:38pm
7
That’s very true for the example string they provided, but it’s useful for this solution to be posted as well in case someone else comes along with a similar question and has a more robust data set.
This way we’re not making any assumptions about the number formats following the invoice number.
bcorrea
(Bruno Correa)
February 3, 2020, 3:40pm
8
Also, that would return a match for something like:
Invoice No. NOT FOUND \r\n \r
so it could be bad in some cases too…
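To make that trade-off concrete, here is a small Python sketch comparing the label-based pattern with a digits-only variant. The digits-only pattern is an assumption added for illustration (it was not proposed in the thread), and the two input strings are shortened samples.

```python
import re

good = "Invoice No. 221944 \r\n \r\n"
bad = "Invoice No. NOT FOUND \r\n \r\n"

# Anything up to the next whitespace after the label.
lazy = r"(?<=Invoice No\. ).*?(?=\s)"
# Hypothetical stricter variant: digits only after the label.
strict = r"(?<=Invoice No\. )\d+(?=\s)"

lazy_on_bad = re.search(lazy, bad).group()   # returns the word "NOT"
strict_on_bad = re.search(strict, bad)       # no match at all
strict_on_good = re.search(strict, good).group()

print(lazy_on_bad)     # NOT
print(strict_on_bad)   # None
print(strict_on_good)  # 221944
```

With the stricter variant, a missing number shows up as "no match" instead of a wrong value, at the cost of assuming invoice numbers are purely numeric.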
chenderson
(Cary Henderson)
February 3, 2020, 3:41pm
9
That could definitely be the case if an invoice number has a space in it.
The regex would definitely need to be modified if that were the case.