Word document scraping for specific text

Hi,

I am trying to read a Word document using the Word activities. I am having trouble extracting specific text from the output of the Read Text activity. The Word document is quite big, but it looks like this:

Meeting Information
a
aName:  ABCTestaID:  123456a
aDate(s):  XX July XXXXaProject # (if applicable):  0987654

I am trying to extract the values of Name, ID, Date(s) and Project #. All these fields are separated by bullets.

I created the regex (?<=\a)(.*?)(?=\a) to get all the text between the bullets, but the bullet character (shown as a here) is not supported in the UiPath RegEx Builder. Is there any other way?

Thanks for the help!

Thanks,
Pallav


The bullet is shown as ‘a’ in the above post. In the actual document it is a round bullet.
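
Since the separator is a round bullet that the builder will not accept as a typed character, one option is to write it as a Unicode escape instead. Below is a minimal Python sketch of that idea, not a UiPath workflow. It assumes the bullet is U+2022 (it could be a different code point in the real document), and the sample string is reconstructed from the post with the 'a' placeholders swapped for that bullet. The .NET regex engine also accepts \uXXXX escapes, but whether the RegEx Builder dialog does is untested here.

```python
import re

# Assumption: the round bullet in the Word output is U+2022.
# Sample text rebuilt from the post, with 'a' replaced by the bullet.
sample = (
    "Meeting Information\n"
    "\u2022\n"
    "\u2022Name:  ABCTest\u2022ID:  123456\u2022\n"
    "\u2022Date(s):  XX July XXXX\u2022Project # (if applicable):  0987654\n"
)

# Same lookaround pattern as in the question, with the bullet written
# as an escape instead of a literal character.
pattern = re.compile(r"(?<=\u2022)(.*?)(?=\u2022)")

between = [m.group(1) for m in pattern.finditer(sample)]
for text in between:
    print(text)
```

On this sample the pattern returns the Name, ID and Date(s) entries but not the Project # entry, because that one is followed by a line break rather than a bullet. Splitting on the bullet instead of matching between two bullets avoids that gap.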


@pallav_aggarwal Welcome to the UiPath community.
Can you share what output you are expecting from the above string?

I want 4 strings as output, corresponding to Name, ID, Date(s) and Project # (if applicable). The output I am expecting is:

  1. ABCTest
  2. 123456
  3. XX July XXXX
  4. 0987654

Note: All the fields are separated by round bullets, and I cannot hard-code a pattern just to get this output. The Word document has many more such fields.

@pallav_aggarwal Use the regex below

For Name

For ID

Can you share some other example strings so I can write the regex for Date and Project?

As I mentioned earlier, I cannot use multiple regex expressions. Here is what the document looks like. This regex is working for me: I can iterate over the matches with a For Each and then use Split to get the expected output. The problem is that the Word document contains round bullets. The expression works here, but in the UiPath RegEx Builder we can’t use the bullet point (it is not recognized). One way might be to replace the bullets with some other character, but I am not able to use the Replace function to replace the bullet symbol.
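
The replace-then-split idea described above can be sketched as follows. This is a Python illustration only, not UiPath code. It assumes the bullet is U+2022 and that every field has the form "label: value"; `extract_fields` is a hypothetical helper name, and the sample string is reconstructed from the first post. The bullet is built from its code point, so it never has to be typed or pasted.

```python
# Assumption: the round bullet is U+2022; chr() builds it from the code point.
BULLET = chr(0x2022)

def extract_fields(text, bullet=BULLET):
    """Hypothetical helper: map each 'label: value' field to its value."""
    fields = {}
    # Replace every bullet with a line break, then walk the pieces.
    for chunk in text.replace(bullet, "\n").splitlines():
        label, sep, value = chunk.partition(":")  # split on the first colon only
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    return fields

# Sample text rebuilt from the post, with 'a' replaced by the bullet.
sample = (
    "Meeting Information\n"
    + BULLET + "\n"
    + BULLET + "Name:  ABCTest" + BULLET + "ID:  123456" + BULLET + "\n"
    + BULLET + "Date(s):  XX July XXXX" + BULLET
    + "Project # (if applicable):  0987654\n"
)

fields = extract_fields(sample)
for label, value in fields.items():
    print(label, "->", value)
```

Because nothing is keyed to a specific label, the same loop should pick up any further "label: value" fields in the document. In a UiPath VB expression the bullet can presumably be produced the same way with ChrW(&H2022) and passed to Replace or Split, though that is not tested here.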

@pallav_aggarwal, did you ever solve this? Please share the solution.