Regex pattern for “200/14-27-961-19W5/0”, “1Z0/03-25-061-24W5/0”

Hello,

I have multiple PDF attachments that I download from Outlook and read one by one, looking for a particular number/pattern such as “100/14-27-061-19W5/0”.
I tried Substring, using the positions of the preceding and succeeding words. That works only for a set of PDFs that share the same layout and contain those preceding/succeeding words; it fails for the others.
Basically, the pattern can be anywhere in the PDF and can have dynamic preceding/succeeding words.
So I thought of using Regex. Could you please help me with the regex syntax for the above pattern?

I have tried implementing the logic based on discussions in similar topics, but it’s not working.

“200/14-27-961-19W5/0”, “1Z0/03-25-061-24W5/0”
Everything in the pattern is dynamic, except that the length will be 15 (not counting the “/” and “-” separators) and there will be a W as shown.

Thanks in advance.

@Faraz_Subhani Use the pattern below. It may be helpful.

pattern - “[\w]{3}/[\d]{2}-[\d]{2}-[\d]{3}-[\w]{4}/[\d]{1}”
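
For anyone who wants to check the pattern outside UiPath, here is a minimal sketch in Python. It assumes Python's `re` module treats `\w` and `\d` the same way as the .NET engine for this kind of ASCII text, and the sample sentence is made up for illustration. The single-item brackets such as `[\w]` are dropped because they do not change the meaning.

```python
import re

# Same pattern as in the reply above, without the redundant [ ] around \w and \d.
# Assumption: Python's `re` behaves like .NET regex for these simple constructs.
ID_PATTERN = re.compile(r"\w{3}/\d{2}-\d{2}-\d{3}-\w{4}/\d")

# Hypothetical text standing in for the content read from one PDF.
sample_text = "Ref 200/14-27-961-19W5/0 and also 1Z0/03-25-061-24W5/0 in the report."

# findall returns every non-overlapping match, wherever it sits in the text.
found = ID_PATTERN.findall(sample_text)
print(found)  # ['200/14-27-961-19W5/0', '1Z0/03-25-061-24W5/0']
```

Because the pattern has no `^` or `$`, the words before and after each number do not matter.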

Hey!
Thanks for your immediate response.
I am trying your solution in Matches (Pattern), but I don’t understand why I am getting this kind of output for all the PDFs that the bot reads while trying to find the matching pattern.
image

Maybe I am doing something stupid.

@Faraz_Subhani Did you import System.Text.RegularExpressions in the Imports panel?

Hello,

Thanks for your reply. For which control do you want me to import it?
image

I did it for the For Each loop; I am not getting any import option for Matches.
I also imported System.Text.RegularExpressions.Match.
I have yet to test it because my flow is broken, but I think I am confused about where to do the import.

@Faraz_Subhani Check the link below for how to import.

Thank you,
That was really stupid of me; it was right in front of my eyes. But I am still getting the same output in the message box for all the PDFs, as in the snapshot above. I even restarted UiPath as well as my system. :-(

Hello,
I am stuck with this. I am using the pattern “/^[0-9]{3}[/][0-9]{2}[-][0-9]{2}[-][0-9]{2,3}[-][0-9]{2}[W][0-9]{1}[/][0-9]{1}$/” to extract numbers of the form “100/14-27-061-19W5/0” wherever they occur in the PDF files I read in a loop, but it fails.
I tried hard-coding this number to check whether my pattern works, and it does, but when I change the number to “abch100/14-27-061-19W5/0”, it fails.
So, basically, how do I change the pattern so that it extracts the number even if there is something before or after it (in the bulk text which the bot reads from the PDF)?
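
A likely reason for the failure described above, sketched in Python: the pattern ends in `$`, and assuming it also started with `^`, it can only match when the whole input is exactly the number. The variable names and the embedded sample string are made up for illustration, and Python's `re` is assumed to behave like the .NET engine here.

```python
import re

# The body of the pattern from the post, without the single-character brackets.
core = r"[0-9]{3}/[0-9]{2}-[0-9]{2}-[0-9]{2,3}-[0-9]{2}W[0-9]/[0-9]"

# Assumption: the original pattern was anchored at both ends with ^ and $.
anchored = re.compile("^" + core + "$")
unanchored = re.compile(core)

exact = "100/14-27-061-19W5/0"
embedded = "abch100/14-27-061-19W5/0 and more text"  # hypothetical PDF text

# Anchored: matches only when the input is the number and nothing else.
anchored_exact = anchored.search(exact) is not None
anchored_embedded = anchored.search(embedded) is not None

# Unanchored: finds the number anywhere inside the surrounding text.
unanchored_hit = unanchored.search(embedded).group()

print(anchored_exact, anchored_embedded, unanchored_hit)
```

Dropping the anchors is what lets the pattern work on bulk text. Note that this version still requires a digit-only first block, so it would not match “1Z0/03-25-061-24W5/0”.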

@Faraz_Subhani Try the pattern below. It will work.

pattern - “[\w]*/[\d]{2}-[\d]{2}-[\d]{3}-[\w]{4}/[\d]{1}”

Please first list as many of the different numbers that appear in your PDFs as you can, then try to write one general pattern that covers all of them.

Hi,
Thanks for your response. I tried your pattern, but I don’t know why I am getting this error when I try to display the matched value for the pattern in a message box.
image
I think it has something to do with the data types, as the matched values for the pattern are stored in an IEnumerable whereas I am trying to display them as a string.
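
The same type mismatch can be shown by analogy in Python, where a search also returns match objects rather than strings. This is only a sketch of the idea (take the text out of each match, then join), not the contents of the attached workflow; the sample text is made up. In .NET the matched text of each match is its `Value` property.

```python
import re

# Second pattern from the thread, without the redundant brackets.
pattern = re.compile(r"\w*/\d{2}-\d{2}-\d{3}-\w{4}/\d")

# Hypothetical text standing in for one PDF's content.
text = "first 100/14-27-061-19W5/0 then 1Z0/03-25-061-24W5/0"

# The search result is a collection of match objects, not a string.
matches = list(pattern.finditer(text))

# Pull the matched text out of each match (the analogue of Match.Value in .NET).
values = [m.group() for m in matches]

# Only now is there something that can go into a message box as one string.
message = ", ".join(values)
print(message)
```

Looping over the collection and showing each item's text one at a time would work just as well as joining.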

I am attaching my workflow, please assist. Pattern.xaml (8.8 KB)

@Faraz_Subhani Try the code below; I have made some changes.

Pattern.xaml (10.0 KB)

Superb!! That works perfectly.
Thank you for being patient and helping me throughout :-)

@Faraz_Subhani It’s OK, no problem.

Hi @Manjuts90

Thanks again for your solution on pattern finding. That is working well, but now I am facing another issue.
My bot reads one .pdf at a time and extracts the strings that match the pattern. For instance, if a PDF contains X, which matches the pattern, it extracts X; if the PDF contains X four times, it extracts the same string four times (which is correct according to the logic). However, when there are duplicates, I want to extract only one, i.e. the distinct value and not the copies. If two matches are different, it should extract X as well as Y.
I am not able to implement the distinct feature; it doesn’t seem to work, or I am doing something wrong.
Could you please help me with it?

@Manjuts90 Apply Distinct in the For Each loop, as highlighted in the image.

Capture

Hi,
I tried the above solution, but it still extracts the duplicates as well, not just the distinct values :-(
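
One possible explanation, offered as a guess: if the distinct step compares the match objects themselves rather than their text, every match counts as different even when the text is identical. A minimal Python sketch of deduplicating on the matched text instead, with made-up sample text:

```python
import re

# Looser pattern from later in the thread, without the redundant brackets.
pattern = re.compile(r"\w*/\d{1,2}-\d{1,2}-\d{2,3}-\w{4}/\d{1,2}")

# Hypothetical PDF text: one number appears three times, another once.
text = (
    "100/14-27-061-19W5/0 x 100/14-27-061-19W5/0 "
    "y 1Z0/03-25-061-24W5/0 z 100/14-27-061-19W5/0"
)

# All matches as plain strings, duplicates included.
all_values = pattern.findall(text)

# dict.fromkeys keeps the first occurrence of each string and preserves order.
unique_values = list(dict.fromkeys(all_values))

print(len(all_values), unique_values)
```

The key step is converting each match to its string before removing duplicates; in .NET terms that means working on each match's `Value` rather than on the match object.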

Apart from using Distinct, is there anything we can apply to the regex pattern itself so that it extracts only the unique values? I tried “[\w]*/[\d]{1,2}-[\d]{1,2}-[\d]{2,3}-[\w]{4}/[\d]{1,2}?<!\1[\s\S]\1”, where the suffix at the end is the part I added, but even that doesn’t seem to work.

Any suggestions, please? I am really stuck with this.

@Faraz_Subhani Check the link below.

Hi @Manjuts90, I already checked that link. As you can see in my previous comment, I tried adding the suffix *<?!\1[\s\S]\1) to my regex pattern, but it doesn’t seem to work.
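
For reference, a regex-only approach needs two things the attempted suffix lacks: a capture group around the number, so that `\1` refers to something, and a complete negative lookahead `(?!...)`. The sketch below is in Python and assumes the .NET engine treats backreferences inside a lookahead the same way; the `\b` word boundaries and the sample text are additions made for illustration.

```python
import re

# Group 1 captures the number; the lookahead rejects a match whenever the same
# text appears again later, so only the LAST occurrence of each number is kept.
# The \b boundaries (an addition) stop a match from starting mid-word.
last_only = re.compile(r"(\b\w+/\d{2}-\d{2}-\d{3}-\w{4}/\d\b)(?![\s\S]*\1)")

# Hypothetical PDF text with one repeated number.
text = "A 100/14-27-061-19W5/0 B 1Z0/03-25-061-24W5/0 C 100/14-27-061-19W5/0"

# With a single capture group, findall returns the text of group 1.
unique = last_only.findall(text)
print(unique)  # ['1Z0/03-25-061-24W5/0', '100/14-27-061-19W5/0']
```

The results come out in order of last appearance, and the lookahead rescans the rest of the text for every candidate, so on large PDFs removing duplicates after matching is probably simpler and faster.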