How to write a regex in the Matches activity?

BM0031 PRINCIPLES OF ACCOUNTING
PRINCIPLES OF FINANCIAL MANAGEMENT/BUSINESS FINANCE

The two lines above are the different formats I want to extract from different Word documents. How do I write the regex, and what should the input be? It would be great to have an example to use or refer to.

Just write the regex into the “pattern” property.
By the way, I am not sure what your requirement is.

Hi, my requirement is to extract module syllabus details, for example the module name and module code from my previous post. The two lines are in different formats. I want to extract the name as the module name and the code as the module code, and if there is no module code, it should return empty.

Two questions:

  1. What is your input?
  2. What do you want the regex to be able to get from that input?
  1. My input is multiple Word documents.
  2. The regex must handle the different formats of the same syllabus fields, e.g. module name and module code. The Word documents start like this:
    BM0031 PRINCIPLES OF ACCOUNTING
    PRINCIPLES OF FINANCIAL MANAGEMENT/BUSINESS FINANCE
    These two lines are from two different Word documents. I want a regex that reads both patterns and stores BM0031 as the module code and the rest as the module name.

Hi @Jovian_Low,

We need more sample inputs before we can give you an exact regex.

You can use regex101.com to test the regex.

I have inputs like BM0031 PRINCIPLES OF ACCOUNTING, PRINCIPLES OF FINANCIAL MANAGEMENT/BUSINESS FINANCE, BM0523 SERVICES MARKETING MANAGEMENT, IT1528 CYBER SECURITY TECHNOLOGY, LAW AND ETHICS, IT3526 Cyber Security Attack & Defense, etc. I want to write a regex that can read all these types.

Hi @Jovian_Low,

To get the book code, use the pattern below:

Pattern -> ([A-Za-z]{2}[0-9]{4})

To get all the book names in the line:

Pattern -> (?<=[A-Za-z]{2}[0-9]{4}).*?(?=[A-Za-z]{2}[0-9]{4})
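As far as I can tell, the name pattern above only matches text that sits between two codes, so a name with no code before it (or no code after it) would be missed. Here is a minimal sketch of one alternative, in Python rather than UiPath, that treats each line separately and makes the code optional. The names `MODULE_LINE` and `split_module_line` are made up for illustration, and it assumes a code is always two letters followed by four digits at the start of the line.

```python
import re

# Assumption from this thread: a module code is two letters + four digits.
# The code group is optional; the name is whatever remains on the line.
MODULE_LINE = re.compile(
    r"^\s*(?:(?P<code>[A-Za-z]{2}[0-9]{4})\s+)?(?P<name>\S.*?)\s*$"
)

def split_module_line(line):
    """Return (code, name); code is "" when the line has no module code."""
    m = MODULE_LINE.match(line)
    if m is None:  # blank line
        return "", ""
    return m.group("code") or "", m.group("name")

samples = [
    "BM0031 PRINCIPLES OF ACCOUNTING",
    "PRINCIPLES OF FINANCIAL MANAGEMENT/BUSINESS FINANCE",
    "IT1528 CYBER SECURITY TECHNOLOGY, LAW AND ETHICS",
    "IT3526 Cyber Security Attack & Defense",
]

for s in samples:
    code, name = split_module_line(s)
    print(repr(code), "|", name)
```

The same idea should carry over to the “pattern” property, but note that .NET writes named groups as `(?<code>...)` instead of Python's `(?P<code>...)`. This only works if the module line can be isolated from the rest of the document text first.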

Regards,
Arivu

Hi, I tried the book code pattern, and the attached images show what I did in my program.


This is the first part of the program

Second part of the program

This is my output in Excel. Why do I get this instead of the book code?


Please help!!

Can you send me the XAML file? I'll check it and get back to you.

Here it is, thanks a lot. Regex.xaml (12.7 KB)

“The start of the word documents have this:
BM0031 PRINCIPLES OF ACCOUNTING
PRINCIPLES OF FINANCIAL MANAGEMENT/BUSINESS FINANCE”

I see that in the second case, the module name is not preceded by a module code. If this were all the input we had, creating a regex for it would not be a problem. But your documents will contain much more text than just these names, so a generic regex would also match unrelated text. We have to figure out a way to separate the module name and code from the rest of the text.

Please share a sample document which has your input.

Hey @Jovian_Low

Use a For Each activity with the type argument System.Text.RegularExpressions.Match, then loop over the matches and take the value of each one.

You are passing the Matches output directly to Add Data Row, which is why you get that output.
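To illustrate the difference, here is a rough Python analogy (not UiPath code). Converting the whole collection of matches to text gives a description of the objects, while looping over it and taking each match's text gives the codes. In UiPath, I believe the equivalent inside the For Each would be `item.Value`; treat that as an assumption to verify in your workflow.

```python
import re

CODE = re.compile(r"[A-Za-z]{2}[0-9]{4}")

text = (
    "BM0031 PRINCIPLES OF ACCOUNTING\n"
    "BM0523 SERVICES MARKETING MANAGEMENT\n"
    "IT1528 CYBER SECURITY TECHNOLOGY, LAW AND ETHICS"
)

matches = list(CODE.finditer(text))

# Mistake: writing the collection itself as one cell value.
# This yields a description of the match objects, not the codes.
wrong_row = [str(matches)]

# Fix: loop over the matches and write one row per matched value.
rows = [[m.group(0)] for m in matches]

for row in rows:
    print(row)
```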

Regards…!!
Aksh

FYI, assuming the module code is two letters followed by four digits, ([A-Za-z]{2}[0-9]{4}) would work just fine.

@aksh1yadav Hi, I am quite new to UiPath. Do you have an example of what you just described?

Hi @siddharth, I tried it, but it doesn't return the module code in Excel.

@siddharth I have attached a XAML file above with that pattern in the Matches activity.

I’ve just downloaded your workflow. I know the problem you might be facing.