Scraping information from text document

GoldCutlery · December 6, 2019, 4:09pm

Hello all. I cannot share screenshots for security reasons, and all data given is dummy test data.

I am working on a process that will take a .rtf file, gather specific information, and store all of it in a Data Table.

Test Data:

Family Name BAGGINS
Given Name BILBO
some other irrelevant information
Family Name TOMLINSON
Given Name BABETTE
more irrelevant information

The goal is to extract every instance of Family Name into an array and then again with Given Name.

Currently I have the .rtf file stored as a String using Word Application Scope.
I then tried UiPath.Core.Activities.Matches with a RegEx, which works fine for extracting the data. However, I am having a hard time working with the output type of this activity, because I need to store all of the gathered data in a Data Table. If, for example, Family Name has more elements than Given Name, I run into issues when adding the Data Rows in a later loop.

I then tried a String Split, which I currently have as an Assign:

String[] Test = RTF.Split({"Family Name", "Given Name"}, StringSplitOptions.RemoveEmptyEntries)

This isolates the required data, but it also leaves my array with many unusable elements (all the irrelevant information).

Is there a way to either remove entries that are irrelevant in my Split method or easily create a DataTable using multiple outputs from Matches?

KannanSuresh · December 7, 2019, 6:42am

Use the regular expression below.

((?<=Family Name ).*)\n(Given Name .*)

This should give you the Family Name in the first group and the Given Name line in the second group. From the second group, you have to remove the constant string "Given Name".

Click here to see the regular expression in action.
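
As a rough illustration of the paired-match idea above, here is a minimal sketch in Python rather than in UiPath's VB.NET expressions. The names `rtf_text`, `PAIR_PATTERN` and `pairs` are hypothetical, the sample text is the dummy data from the question, and the optional `\r` is an assumption about Windows line endings. The second group is assumed to hold the whole Given Name line, as the post describes. Because both names come from a single match, a Family Name with no Given Name on the next line is skipped, so the two values cannot get out of step.

```python
import re

# Dummy test data from the question, assumed to be already read into a string.
rtf_text = (
    "Family Name BAGGINS\n"
    "Given Name BILBO\n"
    "some other irrelevant information\n"
    "Family Name TOMLINSON\n"
    "Given Name BABETTE\n"
    "more irrelevant information\n"
)

# Group 1: text after "Family Name "; group 2: the whole "Given Name ..." line.
PAIR_PATTERN = re.compile(r"((?<=Family Name ).*)\r?\n(Given Name .*)")

pairs = []
for match in PAIR_PATTERN.finditer(rtf_text):
    family = match.group(1).strip()
    # Remove the constant "Given Name" prefix from the second group.
    given = match.group(2).replace("Given Name", "", 1).strip()
    pairs.append((family, given))

print(pairs)
```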

Manikandasamy · December 7, 2019, 7:06am

@GoldCutlery

Use the Add Data Row activity and enter the expression below in the ArrayRow property.
{DataRegex.Groups(1).ToString, DataRegex.Groups(2).ToString}

Note: This stores the regex groups in the row array. Reply if you need more help.
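
The row-building step can also be sketched outside UiPath. The Python snippet below only illustrates the logic and is not the Add Data Row activity itself: a list of dictionaries stands in for the Data Table, and `build_rows`, `NAME_PATTERN` and the column names are hypothetical. It assumes a pattern that captures each value directly in its own group, so no prefix needs to be stripped afterwards.

```python
import re

# Assumed pattern: one match per person, values captured in groups 1 and 2.
NAME_PATTERN = re.compile(r"Family Name (.*)\r?\nGiven Name (.*)")


def build_rows(text):
    """Return one row (dict) per Family Name / Given Name pair found in text."""
    rows = []
    for match in NAME_PATTERN.finditer(text):
        # Similar in spirit to passing {Groups(1), Groups(2)} as the row array.
        rows.append({
            "FamilyName": match.group(1).strip(),
            "GivenName": match.group(2).strip(),
        })
    return rows


sample = (
    "Family Name BAGGINS\nGiven Name BILBO\nsome other irrelevant information\n"
    "Family Name TOMLINSON\nGiven Name BABETTE\nmore irrelevant information\n"
)
for row in build_rows(sample):
    print(row["FamilyName"], row["GivenName"])
```

Since each row is built from one match, a record that lacks a Given Name line produces no row at all instead of a misaligned one.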
