naotosx
(Naoto)
August 23, 2018, 8:32pm
1
Hello, I'm reading a PDF and storing its text in a variable called pdfText. I'm doing this because reading the specific amount directly does not work, but if I read a table of my PDF I get this:
As you may guess, I need to extract the amount for “Arriendo de Televia” if it exists, but I don't know how to do this. All of this info is in my variable pdfText, which is the result of a Get Text activity. I thought that maybe something like this would work:
pdfText.Contains("Arriendo de Televía") = true
But after this, how do I extract the exact amount?
quihan
(Qui Han)
August 23, 2018, 8:43pm
2
You could use the Matches activity, which takes a regular expression. The simplest pattern I can think of right now is:
Arriendo de Televia\s\d*\.?\d*
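For anyone who wants to try this pattern outside UiPath, here is a minimal Python sketch. The sample text is made up (the real content of pdfText is not shown in this thread), and the dot is written as \. so that it only matches a literal dot.

```python
import re

# Hypothetical sample standing in for the pdfText variable from the question.
pdf_text = "Cargo fijo 2.500\nArriendo de Televia 1.546\nTotal 4.046"

# Label, one whitespace character, digits, an optional literal dot, more digits.
pattern = r"Arriendo de Televia\s\d*\.?\d*"

match = re.search(pattern, pdf_text)
# The match includes the label itself, not only the amount.
matched_text = match.group(0) if match else None
```

Note that this pattern returns the label together with the amount, so the label still has to be removed afterwards.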
Konrad
(Kasper Larsen)
August 24, 2018, 7:18am
3
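I would use Matches with this pattern: (?<=Arriendo de Televia\s).*
The lookbehind keeps the label out of the match, so only the text after it (the amount) is returned.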
readPDFText.xaml (12.6 KB)
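As a rough illustration of the lookbehind pattern, here is a small Python sketch. The sample text is an assumption, since the real PDF content is not shown here; Python's re module accepts this lookbehind because \s has a fixed width of one character.

```python
import re

# Hypothetical sample standing in for the pdfText variable.
pdf_text = "Cargo fijo 2.500\nArriendo de Televia 1.546.567\nTotal 1.549.067"

# (?<=...) asserts that the label precedes the match without consuming it;
# .* then takes the rest of the line (by default . does not match a newline).
pattern = r"(?<=Arriendo de Televia\s).*"

match = re.search(pattern, pdf_text)
amount = match.group(0) if match else None
```

Because .* runs to the end of the line, this only gives a clean amount when nothing else follows the amount on that line.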
Also, here is an online regex tester that you can use to understand regex and build your own patterns:
https://regex101.com/
And a fun one:
https://regexcrossword.com/
naotosx
(Naoto)
August 24, 2018, 2:09pm
4
Thanks! @Konrad @quihan you are both right, it works, but I have two questions:
1) How can I check whether the Matches activity succeeded or not? I'm asking because when the text does not exist I get an error.
2) Does this still work if the amount has more than one dot “.”, e.g. 1.546.567? If not, what would be the way to approach this?
I solved this with: "Arriendo de Televía\s\d*\.?\d*\.?\d*" and it works for amounts in the formats x.xxx, xxx.xxx, x.xxx.xxx and xx.xxx.xxx.
Thanks again!
quihan
(Qui Han)
August 24, 2018, 10:12pm
5
Hi @naotosx, to avoid that error, check the match count before accessing the result:
Matches.Count > 0
This checks whether there are any matches. If there are, do the necessary processing.
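The same guard can be sketched in Python, where len(matches) > 0 plays the role of Matches.Count > 0. The function name and the sample strings are hypothetical, chosen only for illustration.

```python
import re

PATTERN = r"Arriendo de Televia\s\d*\.?\d*"

def first_match_or_none(pdf_text):
    """Return the first match, or None when the label is not in the text."""
    matches = re.findall(PATTERN, pdf_text)
    # Only index into the result when at least one match exists.
    if len(matches) > 0:
        return matches[0]
    return None

found = first_match_or_none("Arriendo de Televia 1.546")
missing = first_match_or_none("Cargo fijo 2.500")
```

Here found holds the matched text and missing is None, so the caller can branch on the result instead of hitting an error.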
Regarding your second question (amounts with more than one dot “.”, e.g. 1.546.567):
To solve this, if you are sure that there will be a line break after your text, you can use:
Arriendo de Televia\s.*
If not, you can use:
Arriendo de Televia\s\d*(\.?\d*)*
The difference between this pattern and yours is that it will still work if a future value has more than two dots.
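To see how the repeated group copes with any number of dots, here is a minimal Python sketch. The helper name find_amount and the sample strings are assumptions made for illustration; the label is stripped from the match afterwards because this pattern includes it.

```python
import re

LABEL = "Arriendo de Televia"

# Label, one whitespace character, digits, then any number of
# "optional dot + digits" groups, so the number of dots is not limited.
pattern = re.compile(LABEL + r"\s\d*(\.?\d*)*")

def find_amount(text):
    """Return the amount after the label as a string, or None if absent."""
    match = pattern.search(text)
    if match is None:
        return None
    # Drop the label, then the whitespace that separated it from the amount.
    return match.group(0)[len(LABEL):].strip()

two_groups = find_amount("Arriendo de Televia 1.546")
four_groups = find_amount("Arriendo de Televia 12.345.678.901\nTotal 1")
absent = find_amount("Cargo fijo 2.500")
```

The result stays a string such as "12.345.678.901"; converting it to a number would require knowing how the dots are meant to be read, which is left out here.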
Hope this helps.
Konrad
(Kasper Larsen)
August 27, 2018, 8:12am
6
What Quihan said
Updated workflow with the changes:
readPDFText.xaml (17.9 KB)
naotosx
(Naoto)
August 27, 2018, 2:06pm
7
quihan
(Qui Han)
August 27, 2018, 8:35pm
8
No worries @naotosx, happy automating!