Best way to extract a field from a chunk of text?

I have a PDF document converted to raw text via Read PDF Text. I want to extract a particular field in the middle of the raw text:


Some text here
Some text here
Buyer Address:
(Variable text to be extracted)
Seller Address:
Some text here
Some text here

Currently I am using two String.Split activities… It works, but it is not very elegant. Is there a way to do this with a single Substring expression instead?

Regex (regular expressions) is the best option.

A simple pattern could look like this: Buyer Address (.*) Seller Address

What’s the full line of code to extract the field?

Using Assign:

Buyer_AD_AR = Regex.Matches(Output_Variable,"Buyer Address (.*) Seller Address")

where Buyer_AD_AR is of type System.Text.RegularExpressions.MatchCollection.

(Make sure the System.Text.RegularExpressions namespace is imported in your project to use the above.)

Or you could use the Matches activity.

How do I extract the String which is the Output_Variable?

Or do I need to convert System.Text.RegularExpressions.MatchCollection to String?

Output_Variable is your variable from Read PDF Text

Do I use a .ToString method on the MatchCollection type?


Refer to this post.

Arivu


Once you have the value in Buyer_AD_AR, use another Assign:

Buyer_AD = Buyer_AD_AR(0).Groups(1).ToString

where Buyer_AD is of type String.

I remember one of the practice exercises used only one Assign activity with a Substring… what was the method used there?


Please use an Assign activity: str_buyer_address = System.Text.RegularExpressions.Regex.Match(your_PDF_Text,"(?<=Buyer Address:)[\s\S]*(?=Seller Address:)").ToString
It will return all the text in between “Buyer Address:” and “Seller Address:” as string.
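
As a minimal sketch of that expression outside UiPath, the following VB.NET console module runs it against text shaped like the sample in the question. The sample lines and the names pdfText and buyerAddress are made up for illustration, and Trim is added here only to drop the line breaks that surround the captured block.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BuyerAddressDemo
    Sub Main()
        ' Hypothetical sample standing in for the Read PDF Text output.
        Dim lines As String() = {
            "Some text here",
            "Buyer Address:",
            "12 Example Road",
            "Sample City",
            "Seller Address:",
            "Some text here"}
        Dim pdfText As String = String.Join(Environment.NewLine, lines)

        ' The lookbehind and lookahead keep the two labels out of the match;
        ' [\s\S]* matches any character, including line breaks.
        Dim pattern As String = "(?<=Buyer Address:)[\s\S]*(?=Seller Address:)"
        Dim buyerAddress As String = Regex.Match(pdfText, pattern).Value.Trim()

        ' Prints the two address lines without the labels.
        Console.WriteLine(buyerAddress)
    End Sub
End Module
```

Note that [\s\S]* is greedy: if "Seller Address:" appeared more than once in the text, the match would run up to the last occurrence. Using [\s\S]*? instead would stop at the first one.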

Warm regards,


(see latest post)

Ok how do I modify it to extract text within brackets?

Service fees (Jan 2017 to Jun 2017)

I want to extract only the part which says “Jan 2017 to Jun 2017”

Does it work with brackets as well? Again, need this to work within one Assign activity.
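
One possible way to do this in a single Assign is to escape the parentheses in the pattern, since ( and ) are special characters in regex. The sketch below is an assumption about what would fit this case, not something confirmed in the thread; the variable names line and period are hypothetical.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BracketDemo
    Sub Main()
        Dim line As String = "Service fees (Jan 2017 to Jun 2017)"

        ' \( and \) match literal parentheses; [^)]* matches everything
        ' up to the first closing parenthesis.
        Dim period As String = Regex.Match(line, "(?<=\()[^)]*(?=\))").Value

        Console.WriteLine(period) ' Jan 2017 to Jun 2017
    End Sub
End Module
```

In an Assign activity, the right-hand side would be just the Regex.Match(...).Value expression, with your own String variable in place of line.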

Ok I am using this expression in a different workflow, now it says “Regex is not declared. It may be inaccessible due to its protection level”. How do I resolve this?

Import the System.Text.RegularExpressions namespace in your project.


How do I import a namespace?


This should help:

I am now encountering the following error:


Source: Assign

Message: Specified argument was out of the range of valid values.
Parameter name: i

Exception Type: System.ArgumentOutOfRangeException

An ExceptionDetail, likely created by IncludeExceptionDetailInFaults=true, whose value is:
System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values.
Parameter name: i
   at System.Text.RegularExpressions.MatchCollection.get_Item(Int32 i)
   at lambda_method(Closure , ActivityContext )
   at Microsoft.VisualBasic.Activities.VisualBasicValue`1.Execute(CodeActivityContext context)
   at System.Activities.CodeActivity`1.InternalExecuteInResolutionContext(CodeActivityContext context)
   at System.Activities.Runtime.ActivityExecutor.ExecuteInResolutionContext[T](ActivityInstance parentInstance, Activity`1 expressionActivity)
   at System.Activities.InArgument`1.TryPopulateValue(LocationEnvironment targetEnvironment, ActivityInstance activityInstance, ActivityExecutor executor)
   at System.Activities.RuntimeArgument.TryPopulateValue(LocationEnvironment targetEnvironment, ActivityInstance targetActivityInstance, ActivityExecutor executor, Object argumentValueOverride, Location resultLocation, Boolean skipFastPath)
   at System.Activities.ActivityInstance.InternalTryPopulateArgumentValueOrScheduleExpression(RuntimeArgument argument, Int32 nextArgumentIndex, ActivityExecutor executor, IDictionary`2 argumentValueOverrides, Location resultLocation, Boolean isDynamicUpdate)
   at System.Activities.ActivityInstance.ResolveArguments(ActivityExecutor executor, IDictionary`2 argumentValueOverrides, Location resultLocation, Int32 startIndex)
   at System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor executor, BookmarkManager bookmarkManager, Location resultLocation)

This happens when I try to retrieve the string from the MatchCollection. I am sure that the text file I am searching contains the start delimiter, the end delimiter, and some text in between. What might be causing this? It worked fine when I tried it yesterday; I only encountered the error today.

First activity: Regex.Matches(text,"Start text (.*) End Text")
Second activity: x(0).Groups(1).ToString

Check your Regex pattern on sites like

Put your full string there and the pattern and you can confirm whether the pattern works or not

In case things in your output string changed (like space between words/positions) the regex will fail. So you need to design regex in such a way that it can handle all possible shifts in the string as well (at least all expected based on tests)
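
The stack trace points at MatchCollection.get_Item, which is what x(0) calls, so the exception is consistent with the pattern finding no match at all, in which case index 0 does not exist. A hedged sketch of checking for a match before reading the group follows; the sample text and the name sourceText are hypothetical, and in a workflow the same check could sit in an If activity.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GuardDemo
    Sub Main()
        ' Hypothetical input where the delimiters sit on separate lines, so
        ' "Start text (.*) End Text" cannot match: the pattern expects a space
        ' after "Start text", and the dot stops at line breaks.
        Dim sourceText As String = "Start text" & Environment.NewLine &
                                   "value" & Environment.NewLine &
                                   "End Text"

        Dim x As MatchCollection = Regex.Matches(sourceText, "Start text (.*) End Text")

        If x.Count > 0 Then
            Console.WriteLine(x(0).Groups(1).Value)
        Else
            ' Reading x(0) here would throw ArgumentOutOfRangeException.
            Console.WriteLine("No match found")
        End If
    End Sub
End Module
```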

How do I modify the regex to include line breaks? In (.*), the dot matches any character except line breaks.
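
Two common options in .NET regex are passing RegexOptions.Singleline, which makes the dot match line breaks too, or replacing the dot with [\s\S]. The sketch below assumes the delimiters from the earlier post and a made-up sample text; \s* is used instead of a literal space so that the delimiters may be followed or preceded by line breaks.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MultilineDemo
    Sub Main()
        ' Hypothetical sample with the value spread over two lines.
        Dim sourceText As String = "Start text" & Environment.NewLine &
                                   "line one" & Environment.NewLine &
                                   "line two" & Environment.NewLine &
                                   "End Text"

        ' Option 1: Singleline lets "." match line breaks as well.
        Dim m1 As Match = Regex.Match(sourceText, "Start text\s*(.*?)\s*End Text", RegexOptions.Singleline)

        ' Option 2: [\s\S] matches any character without needing an option.
        Dim m2 As Match = Regex.Match(sourceText, "Start text\s*([\s\S]*?)\s*End Text")

        ' Both groups hold "line one" and "line two" with the line break between them.
        Console.WriteLine(m1.Groups(1).Value)
        Console.WriteLine(m1.Groups(1).Value = m2.Groups(1).Value) ' True
    End Sub
End Module
```

The lazy quantifier *? stops at the first "End Text"; a greedy * would run to the last one if the delimiter repeats.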