RegEx tool confusion

mardoza · December 5, 2023, 9:16pm

Hi team,

I’m new to using RegEx and was looking for help to understand what I’m doing wrong.

Here’s the Regular Expression I created to extract the information around the description of each item from a PDF. However, I can’t share the actual PDF, so I included the text extracted in a doc.

What I don’t understand is why the information I capture on the site isn’t the same as what UiPath captures. Do I need to follow different rules? Am I approaching this problem the wrong way? Should I do it differently?

I have never created an automation like this, so I’m pretty much out of options in my mind.

New folder.zip (27.7 KB)

Steven_McKeering · December 5, 2023, 10:27pm

Hello

There are always multiple ways to solve the problem.

I have created a pattern and sample workflow which you might find helpful. I have had to take a guess at what you wanted to extract.

Sample, Output and Pattern go a long way to help us create a regex pattern.
Preview the pattern here

This will return each item as an individual regex match.

The sample workflow will convert each result into a datatable row so you can continue with your project.
Main - Regex-Mardoza.xaml (15.0 KB)

Hopefully you can find my Regex Megapost helpful

Cheers

Steve

Yoichi · December 5, 2023, 11:44pm

Hi,

The pattern should look like the following, for example, because the text seems to contain multiple whitespace characters.

[0-9]{1,3}\s+[0-9]{3}-[A-Z]{4}|[0-9]{1,3}\s+[0-9]{3}-[0-9]{4}
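
To illustrate why `\s+` matters, here is a minimal sketch as a standalone VB.NET console module (outside UiPath). The sample text and the single-space `strict` pattern are made up for illustration, since the real PDF text is not shown in the thread; only the `tolerant` pattern is the one suggested above.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WhitespaceDemo
    ' Counts how many times the pattern matches the text.
    Function CountMatches(text As String, pattern As String) As Integer
        Return Regex.Matches(text, pattern).Count
    End Function

    Sub Main()
        ' Hypothetical extracted text: the columns are separated by several spaces.
        Dim sample As String = "1    123-ABCD   Widget" & vbLf & "12  456-7890   Gasket"

        ' Assumed variant with exactly one literal space between the columns.
        Dim strict As String = "[0-9]{1,3} [0-9]{3}-[A-Z]{4}|[0-9]{1,3} [0-9]{3}-[0-9]{4}"
        ' The suggested pattern: \s+ accepts one or more whitespace characters.
        Dim tolerant As String = "[0-9]{1,3}\s+[0-9]{3}-[A-Z]{4}|[0-9]{1,3}\s+[0-9]{3}-[0-9]{4}"

        Console.WriteLine(CountMatches(sample, strict))   ' 0
        Console.WriteLine(CountMatches(sample, tolerant)) ' 2
    End Sub
End Module
```

If a pattern works in an online tester but not in UiPath, differing whitespace between the pasted sample and the text actually extracted from the PDF is one likely cause worth checking.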

You also need to modify the Add Data Row part as follows.

Hope this helps.

Regards,

mardoza · December 6, 2023, 2:29pm

Thank you for your response, @Steven_McKeering
The RegEx you created (+ 1 addition I made) solves the issue of capturing the data.

I tried using your code, but it throws the below error.

I’m not sure how to get around it.

mardoza · December 6, 2023, 3:40pm

Thank you for your response, @Yoichi.

Your Regular Expression also helped me, and I tried to use your suggestion. I do not understand the last part of the value you assigned to strValue, so I think I’m doing something wrong, because it is giving me an error.

First part of my workflow:

Second part:

And this is the error:

Any suggestion on how to proceed?

Yoichi · December 6, 2023, 11:26pm

Hi,

Sorry, my expression is localized for Japan. The backslash is displayed as a yen sign (￥) in the Japanese locale.
Can you try the following expression? (Please replace the yen sign with a backslash.)

strValue = System.Text.RegularExpressions.Regex.Split(currentItem.Value,"\s+")(1)
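
As a rough sketch of what that expression does: `Regex.Split` cuts the matched value at every run of whitespace, and `(1)` picks the second piece. The module below is a standalone VB.NET console example, not UiPath workflow code; the helper name `PieceAt` and the sample value are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitDemo
    ' Splits a matched value on runs of whitespace and returns the piece at the given index.
    Function PieceAt(matchedValue As String, index As Integer) As String
        Return Regex.Split(matchedValue, "\s+")(index)
    End Function

    Sub Main()
        ' Hypothetical match value: line number, several spaces, part number.
        Dim matched As String = "12   456-7890"

        Console.WriteLine(PieceAt(matched, 0)) ' 12
        Console.WriteLine(PieceAt(matched, 1)) ' 456-7890
    End Sub
End Module
```

With a yen sign left in place of the backslash, the pattern looks for a literal yen sign followed by `s` characters, so nothing is split, the array has a single element, and index `(1)` is out of range. That may be the error seen earlier, although the screenshot is not available to confirm it.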

Regards,

mardoza · December 7, 2023, 12:35pm

@Yoichi thank you again for your response.

I changed the Yen sign for a backslash and it worked. However, I’m only getting the values of Group 1. How can I retrieve the values of all the groups in the Regular expression? Not the whole expression in 1 cell but a cell/row for each group, so I can keep working on it to complete my task.

PS: As an idea, I tried something similar to what @Steven_McKeering suggested, using a counter and the expression {currentItem(counter).Groups(1).ToString.Trim} in the Add Data Row activity, but it is giving me this error:

Yoichi · December 7, 2023, 2:10pm

Hi,

In this case, it’s unnecessary to use a counter. The expression will be {currentItem.Groups(1).ToString.Trim}
It may also be better to use regex named groups, such as (?<LINENUMBER>[0-9]{1,3})\s*(?<PARTNO>[0-9A-Z]{3}\-[0-9A-Z]{4})\s*...... Each group can then be read with an expression like matchVar.Groups("PARTNO").Value
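
A minimal sketch of reading groups by number and by name, written as a standalone VB.NET console module. It uses only the two named groups shown above (the rest of the pattern is elided in the post and is not reconstructed here), and the sample text and the helper name `ReadGroup` are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupDemo
    ' Only the part of the pattern shown in the thread; the remainder is not known.
    Const Pattern As String = "(?<LINENUMBER>[0-9]{1,3})\s*(?<PARTNO>[0-9A-Z]{3}\-[0-9A-Z]{4})"

    ' Returns the named group's value from the first match in the text.
    Function ReadGroup(text As String, groupName As String) As String
        Return Regex.Match(text, Pattern).Groups(groupName).Value
    End Function

    Sub Main()
        ' Hypothetical extracted text.
        Dim sample As String = "1    123-ABCD   Widget" & vbLf & "12  456-7890   Gasket"

        ' Inside For Each, currentItem is already a single Match, so no counter is needed.
        For Each currentItem As Match In Regex.Matches(sample, Pattern)
            ' Groups(1) and Groups("LINENUMBER") refer to the same capture here.
            Console.WriteLine(currentItem.Groups(1).Value & " | " & currentItem.Groups("PARTNO").Value)
        Next
    End Sub
End Module
```

Named groups keep the Add Data Row expression readable and stop it from breaking if another group is later inserted earlier in the pattern.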

Regards,

mardoza · December 7, 2023, 10:22pm

@Yoichi Thank you very much for your help!

I used the Regular Expression and then ran a For Each activity to add all the lines to my data table. Finally, I wrote it in an Excel file. I’m leaving below the expression I used.

{currentItem.Groups("LINEITEM").Value.ToString.Trim, currentItem.Groups("PN").Value.ToString.Trim,.......}
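
For readers who want to see the whole loop in one place, here is a sketch of the same idea in plain VB.NET rather than as UiPath activities. The group names LINEITEM and PN come from the expression above; the pattern itself, the sample text, and the function name `BuildTable` are assumptions, because the final pattern was not posted.

```vbnet
Imports System
Imports System.Data
Imports System.Text.RegularExpressions

Module DataTableDemo
    ' Builds one table row per regex match, one column per named group.
    Function BuildTable(text As String) As DataTable
        Dim table As New DataTable()
        table.Columns.Add("LINEITEM", GetType(String))
        table.Columns.Add("PN", GetType(String))

        ' Assumed pattern; only the group names are taken from the thread.
        Dim pattern As String = "(?<LINEITEM>[0-9]{1,3})\s+(?<PN>[0-9A-Z]{3}-[0-9A-Z]{4})"

        For Each currentItem As Match In Regex.Matches(text, pattern)
            ' Equivalent of the Add Data Row array shown above.
            table.Rows.Add(currentItem.Groups("LINEITEM").Value.Trim(),
                           currentItem.Groups("PN").Value.Trim())
        Next
        Return table
    End Function

    Sub Main()
        ' Hypothetical extracted text.
        Dim sample As String = "1    123-ABCD   Widget" & vbLf & "12  456-7890   Gasket"

        For Each row As DataRow In BuildTable(sample).Rows
            Console.WriteLine(String.Join(", ", row.ItemArray))
        Next
    End Sub
End Module
```

In the actual workflow, the table would then be passed to a write-to-Excel activity, as described in the post.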

system · December 10, 2023, 10:23pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
