Extracting a portion of text from a PDF file

I don’t know if this is the right place to ask, but I can’t work out how to extract a specific piece of text from a PDF file. It is a native, text-based PDF, not a scanned document or an image, so OCR won’t come into play.

I would like to know how to extract text from a PDF file and append it to an existing text file. Screenshots are attached for reference.

This is the sequence in question.

And this is the text I want to extract from the existing PDF file; it is on page 3 of 15.

The steps described here were unclear to me: https://docs.uipath.com/activities/docs/read-pdf-files#read-a-pdf-file-using-the-read-pdf-with-ocr-activity


You can use Regex, but you need to build a Regex pattern first. To do that, we need:

  • A sample of the text
  • The expected output
  • What’s consistent in the text around it
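
To make those three ingredients concrete, here is a minimal sketch in VB.NET, the expression language used in classic UiPath projects. The sample text, the `Invoice No:` label, and all variable names are made up for illustration; the real pattern depends on what the PDF page actually contains.

```vb
Imports System
Imports System.Text.RegularExpressions

Module RegexIngredients
    Sub Main()
        ' 1. A sample: hypothetical text as it might come out of the PDF.
        Dim sample As String = "Customer: ACME Ltd" & Environment.NewLine &
                               "Invoice No: 48213" & Environment.NewLine &
                               "Date: 2021-03-04"

        ' 2. Expected output: the value we want back from that sample.
        Dim expected As String = "48213"

        ' 3. What's consistent: the label "Invoice No:" always precedes the value,
        '    so the pattern anchors on the label and captures the digits after it.
        Dim pattern As String = "Invoice No:\s*(\d+)"

        Dim m As Match = Regex.Match(sample, pattern)
        Dim actual As String = If(m.Success, m.Groups(1).Value, String.Empty)

        ' Compare what the pattern returned with what we expected.
        Console.WriteLine("Extracted: " & actual)
        Console.WriteLine("Matches expected: " & (actual = expected).ToString())
    End Sub
End Module
```

If the label or layout varies between documents, the pattern has to anchor on whatever does stay the same, which is why a real sample is needed before anyone can suggest one.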



Okay. Since the project is in the classic view, how do I build a regex pattern? If there is a walkthrough in the docs, please point me to it.


Take a look here:

All you need is an Assign activity:

Assign Left:

Assign Right:
System.Text.RegularExpressions.Regex.Match(yourStr, "INSERTxREGEXxPATTERN").ToString
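
As a rough sketch of what that Assign does, and how the result could then be appended to a text file, the same call can be written as a standalone VB.NET program. The input string, the `Reference:` pattern, and the output path are placeholders invented for this example, not values from the original workflow.

```vb
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module AssignSketch
    Sub Main()
        ' Stand-in for the String variable that holds the text read from the PDF.
        Dim yourStr As String = "Reference: AB-1029" & Environment.NewLine & "Status: Open"

        ' Placeholder pattern: take the non-space characters that follow "Reference:".
        Dim pattern As String = "(?<=Reference:\s*)\S+"

        ' Same call as the right-hand side of the Assign activity.
        Dim m As Match = Regex.Match(yourStr, pattern)

        ' Match.ToString returns the matched text, or an empty string when
        ' nothing matched, so checking Success avoids appending a blank line.
        If m.Success Then
            ' Hypothetical output path; AppendAllText creates the file if it is missing.
            Dim outputPath As String = Path.Combine(Path.GetTempPath(), "extracted.txt")
            File.AppendAllText(outputPath, m.ToString() & Environment.NewLine)
        End If
    End Sub
End Module
```

Inside a workflow, the left side of the Assign would be a String variable, and an activity such as Append Line could presumably take the place of `File.AppendAllText`.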

Take a look at this sample regex pattern


You can learn more about regex here:

Hopefully this helps



Will do so. Thanks.
