Hello hello,
I have a small problem I couldn't figure out yet, and I was hoping that someone could point me in the right direction.
So, it goes like this: I have some PDFs from which I need to extract some information. They are letters that contain certain variables, like date, account number and so on. I need to extract these variables and compare them with a data table (the comparing part I can manage).
Thing is, I have little to no experience with Regex (I only used it in another small process, but that was easy), and I was wondering if you know of any solution that would make selecting what I need easier. Any suggestion on what to read or where to look in order to figure it out is also welcome.
Thanks,
Cristi
Hey @Cristian_Ionita
This looks possible.
Kindly share your PDF or its screenshot to understand the format of it so that some way of doing that can be suggested.
Thanks
#nK
The file contains sensitive information, so I can't really share it, but it goes something like this, as an example:
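“neachitarea la scadenta a debitului dumneavoastra provenit din contractul incheiat cu Nume si prenume, cu numarul Numar contract, din data de Data contract, pentru care dumneavoastra aveti calitatea .”
(Roughly, in English: “failure to pay when due your debt arising from the contract concluded with [full name], with number [contract number], dated [contract date], for which you have the capacity .”)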
The words that are bold (Nume si prenume, Numar contract, Data contract) are the variables which I would need to extract. Any hint or suggestion is more than welcome, and I can figure out the rest.
thanks
Hi!
May I know which OCR you're using to extract the data? Could we try the ML activities in Document Understanding to extract it?
Have a look at this video: UiPath Document Understanding # 7 | Extract and Validate using ML Extractor | ExpoHub | By Rakesh - YouTube
Regards,
NaNi
Hi
As @THIRU_NANI suggested, this can be accomplished more easily with Document Understanding.
For a demo, have a look at this
Cheers @Cristian_Ionita
Hi!
Try this out:
System.Text.RegularExpressions.Regex.Match(strVariable, "(Nume si prenume)|(Numar contract)|(Data contract)")
Reference: regex101: build, test, and debug regex
Note: RegEx only works when the data to extract follows a recognisable pattern or is an exact, known string.
Regards,
NaNi
Hey @Cristian_Ionita
How are you manually deciding which words are bold?
Are they static or dynamic keywords, or do they follow some other rule?
Kindly confirm.
Thanks
#nK
They are dynamic, sorry if I forgot to mention it. Basically, I have a template for a letter that we are sending to our clients, and I want to double-check the final result of this letter against a data table. In the whole letter, only a couple of variables change (the ones that are bold in my example). I don't know if I managed to explain it properly.
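Since only the variables change and the surrounding template wording stays fixed, one possible approach is to use that fixed wording as anchors and capture whatever sits between them. Below is a minimal sketch in Python, not a definitive solution. The function name `extract_letter_fields`, the group names and the sample values are made up for illustration, and it assumes the PDF text has already been read into a string and that the values themselves contain no commas. In .NET (as used in UiPath) named groups are written `(?<name>...)` instead of Python's `(?P<name>...)`.

```python
import re

# Fixed template wording (taken from the example letter) acts as anchors;
# the named groups capture the values that change from letter to letter.
# Assumption: values contain no commas, since commas delimit the fields.
TEMPLATE_PATTERN = re.compile(
    r"contractul incheiat cu (?P<name>.+?), "
    r"cu numarul (?P<contract_number>.+?), "
    r"din data de (?P<contract_date>.+?) ?, "
    r"pentru care",
    re.IGNORECASE,
)


def extract_letter_fields(letter_text):
    """Return the variable fields as a dict, or None if the wording is not found."""
    # Collapse line breaks and repeated spaces left over from PDF extraction,
    # so the single spaces in the pattern match reliably.
    normalized = " ".join(letter_text.split())
    match = TEMPLATE_PATTERN.search(normalized)
    if match is None:
        return None
    return match.groupdict()


# Made-up sample values, for illustration only.
sample = (
    "neachitarea la scadenta a debitului dumneavoastra provenit din\n"
    "contractul incheiat cu Ion Popescu, cu numarul 12345, din data de "
    "01.02.2021 , pentru care dumneavoastra aveti calitatea de titular."
)
print(extract_letter_fields(sample))
```

The resulting dictionary can then be compared field by field with the matching row of the data table. If a letter does not follow the template wording, the function returns None, which is itself a useful signal that the letter needs a manual check.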
ML is still something that I cannot fully grasp at my level, but thank you both for the suggestion. I've put it on my learning list.