Extract data from .txt file using Anchor Base

Hi,

I am new to UiPath. I would like to extract the text next to the word “Name” in a text file. The word “Name” can appear anywhere in the text file, so I tried Anchor Base, but it doesn’t work.

  1. Can someone help troubleshoot? I can’t attach my file as I am new to the forum. I noticed that Anchor Base works well on websites but doesn’t seem to work with .txt files. Do I need to use another method?

  2. Is there some resource that explains all the attributes in the selector?

Thank you

Use regex or string manipulation to get the text

Thanks
@Anonymous2

1 Like

You can use a regular expression to extract the text next to “Name”.

Can you send some example text so we can try a regex on it?

Hi @Anonymous2

Try this in an Assign activity.

Name (String) = System.Text.RegularExpressions.Regex.Match(variable1, "(?<=Name).+").Value

Sorry, I am quite lost. How do I use “regex”?

How about using “Find Element” → “Set Clipping Region” → “Get OCR Text”? I find the explanations on the UiPath website somewhat lacking in clarity.

An example below:

==

kkkkjj

Name: Mary Tan

Hi @Anonymous2,

I’ve attached an example at the end of this post for better understanding; use it if you need it.
As you said before,

Anonymous2:

The word “Name” can appear anywhere in the text file

So setting a clipping region would not be a suitable option here.
Regex is used to search for a pattern.

I suggest you follow the steps below.

  1. First, drag a Read Text File activity to the start of the sequence.
    Select the .txt file ( “Details.txt” ) from which you want to get the name.
    Create a String variable called Text1 and pass it to the output parameter of the activity.
  2. Create a variable called Name of type String and add an Assign activity as given below.
Name = System.Text.RegularExpressions.Regex.Match(Text1, "(?<=Name).+").Value
  3. Add a Message Box that displays the variable Name.


Here’s one Example (If you need it) —> RegexEx.zip (13.2 KB)

Hi Samir,

Thank you very much, it works! But I would like to understand the syntax used in the Regex formula:
“(?<=Name - ).+”.

How should the Regex formula be for a tabular structure like below?

Variable Sale Price Customer_ID
GH_E 199 32 456
Chr_D 100 12 345

So how do I extract the Customer ID for “Chr_D”?
Thank you!

@Anonymous2

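This is a lookbehind regex pattern, not a lookahead: (?<=Name - ) matches the position right after “Name - ” without including it in the result, and .+ then takes the text from there to the end of the line. ( . means any character except a newline and + means one or more, so .+ together means any character, one or more times.)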
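To make the lookbehind explanation concrete, here is a minimal sketch. It uses Python's `re` module purely for illustration, since `(?<=...)` and `.+` behave the same way there as in .NET for this pattern. The sample text is an assumption and not the real file.

```python
import re

# Hypothetical sample text; the layout of the real file is an assumption.
text = "kkkkjj\nName - Mary Tan\nanother line"

# (?<=Name - ) is a positive lookbehind: "Name - " must sit immediately
# before the match, but it is not part of the matched text.
# .+ then consumes one or more characters and stops at the end of the line,
# because "." does not match a newline.
match = re.search(r"(?<=Name - ).+", text)
name = match.group(0) if match else ""

print(name)  # Mary Tan
```

If the file contains “Name: Mary Tan” instead, the lookbehind would need to be `(?<=Name: )` so that the colon and space are skipped as well.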

When it comes to tabular data, we don’t use regex; there are other methods, such as logic on the DataTable type.

If you want to know more about regex lookahead, lookbehind, and lookaround in depth, click here.
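As a rough illustration of the table-based idea, the sketch below splits each line into fields and looks a row up by its first column. It is written in Python for illustration only and is not the UiPath DataTable API; the function name `parse_table` is hypothetical, and it assumes that fields are separated by whitespace and never contain spaces themselves.

```python
# Sample copied from the table quoted in the thread.
text = """Variable Sale Price Customer_ID
GH_E 199 32 456
Chr_D 100 12 345
"""

def parse_table(raw):
    """Turn whitespace-separated lines into a list of {column: value} dicts."""
    lines = [line for line in raw.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]

rows = parse_table(text)

# Find the row whose "Variable" column is Chr_D and read its Customer_ID.
customer_id = next(
    (row["Customer_ID"] for row in rows if row["Variable"] == "Chr_D"),
    None,
)
print(customer_id)  # 345
```

Looking values up by column name keeps working if a column is added or the number widths change, which a fixed-width regex would not tolerate.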

Hi Samir,

Once again, many thanks for your reply!

If my .txt file has data arranged in a rectangular tabular form, but without all the lines of a proper table what method should I use to extract the data?

The example is given in the previous post:

Variable Sale Price Customer_ID
GH_E 199 32 456
Chr_D 100 12 345

Thank you

Hi

I am able to extract the Customer_ID for Chr_D with
“(?<=Chr_D.{8}).+”

But I have a more difficult issue with another form. I want to extract the line below “COMBINED TEST FOR THE PRESENCE OF CLIMATE CHANGE”. How can I use regex to do so? Thanks!

COMBINED TEST FOR THE PRESENCE OF CLIMATE CHANGE
IDENTIFIABLE CLIMATE CHANGE PRESENT

@Anonymous2,

If you have data in the .txt file as you’ve specified, then use an Assign as given below:

CustID = System.Text.RegularExpressions.Regex.Match(Text1, "(?<=Chr_D ).+").Value.Split(" "c).Last

(?<=Chr_D ).+ will extract the text 100 12 345, but you want the customer ID, so we split that text on a single space and take the last value, i.e. 345.
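The match-then-split idea can be sketched as below. This is Python used for illustration, not the UiPath expression itself; the helper name `last_field` is hypothetical, and the sample is the table quoted earlier in the thread.

```python
import re

text = "Variable Sale Price Customer_ID\nGH_E 199 32 456\nChr_D 100 12 345\n"

def last_field(raw, row_label):
    """Return the last whitespace-separated value on the row starting with row_label."""
    # The lookbehind skips the label and one space; .+ takes the rest of the line.
    match = re.search(r"(?<=" + re.escape(row_label) + r" ).+", raw)
    if match is None:
        return ""
    fields = match.group(0).split()
    return fields[-1] if fields else ""

print(last_field(text, "Chr_D"))  # 345
print(last_field(text, "GH_E"))   # 456
```

Splitting on whitespace and taking the last item means the result does not depend on how many characters the earlier columns occupy, unlike a fixed count such as `.{8}`.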

For your second issue, use an Assign as given below:

Nextline = System.Text.RegularExpressions.Regex.Match(Text1,"(?<=COMBINED TEST FOR THE PRESENCE OF CLIMATE CHANGE)(\n).+").Value

Note: the Text1 variable is the input variable.

Hi Samir

Thanks again for your reply!

For Issue 1, your solution worked.

For Issue 2, I got a blank message box, when it should show “IDENTIFIABLE CLIMATE CHANGE PRESENT”.

Thank you
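One possible cause of the blank result, offered as an assumption because the actual file is not shown, is Windows line endings. If each line ends with `\r\n`, the character right after the heading is `\r`, so a pattern that expects `\n` at that position finds nothing. The Python sketch below reproduces this with made-up sample strings and shows a variant that accepts both line endings; `line_after` is a hypothetical helper name.

```python
import re

HEADER = "COMBINED TEST FOR THE PRESENCE OF CLIMATE CHANGE"

# Hypothetical samples: the same two lines with Unix and Windows line endings.
unix_text = HEADER + "\nIDENTIFIABLE CLIMATE CHANGE PRESENT\n"
windows_text = HEADER + "\r\nIDENTIFIABLE CLIMATE CHANGE PRESENT\r\n"

# The pattern from the thread: it requires "\n" directly after the heading.
original = r"(?<=COMBINED TEST FOR THE PRESENCE OF CLIMATE CHANGE)(\n).+"

def line_after(raw, header):
    """Return the line that follows header, for either kind of line ending."""
    # \r?\n accepts "\n" and "\r\n"; the group captures only the next line.
    match = re.search(re.escape(header) + r"\r?\n([^\r\n]+)", raw)
    return match.group(1) if match else ""

print(re.search(original, windows_text))  # None
print(line_after(windows_text, HEADER))   # IDENTIFIABLE CLIMATE CHANGE PRESENT
print(line_after(unix_text, HEADER))      # IDENTIFIABLE CLIMATE CHANGE PRESENT
```

In .NET the same idea would read the captured group with `Match(...).Groups(1).Value` rather than `.Value`, so that the line break itself is not part of the result.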

Sorry, a query on Issue 1:
CustID = System.Text.RegularExpressions.Regex.Match(Text1, "(?<=Chr_D ).+").Value.Split(" "c).Last

If the extracted string is 100 12 345**, and I want only “345” without the two stars at the end, what should the formula be?

Finally, is there some good resource on all these methods for regex?

Thanks!
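A minimal sketch of one way to drop the trailing stars: take the last field as before, then strip any `*` characters from its right-hand end. This is Python for illustration, and the sample line is an assumption based on the question. In a UiPath expression, the .NET string method `TrimEnd("*"c)` plays the same role as `rstrip("*")` here.

```python
import re

# Hypothetical line based on the question: the last value is followed by two stars.
text = "Chr_D 100 12 345**\n"

match = re.search(r"(?<=Chr_D ).+", text)
last_value = match.group(0).split()[-1]   # "345**"

# Remove every trailing "*" while leaving the digits untouched.
customer_id = last_value.rstrip("*")

print(customer_id)  # 345
```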

I would appreciate it if anyone could reply to my last two queries. Thank you.