Extracting a portion of a text in a pdf file

eduardo34 · March 5, 2023, 9:49pm

I don’t know if this is the right place to ask, but I can’t work out how to extract a specific piece of text from a PDF file. It is a native text document, not a scanned document or an image, so OCR won’t come into play.

I would like to know how I can extract text from a PDF file and append it to an existing text file. Screenshots are attached for reference.

This is the sequence in question.

And this is the text I intend to extract from an existing PDF file. It is on page 3 of 15.

The steps here were unclear to me: https://docs.uipath.com/activities/docs/read-pdf-files#read-a-pdf-file-using-the-read-pdf-with-ocr-activity

Steven_McKeering · March 5, 2023, 10:07pm

Hello

You can use Regex, but you need to build a Regex pattern first.

To do this, we need:

A sample
Expected output
What’s consistent

Cheers

Steve

eduardo34 · March 5, 2023, 10:07pm

Okay. Since the project is in classic view, how can I build a regex pattern? If there is a walkthrough in the docs, please point me to it.

Steven_McKeering · March 5, 2023, 10:14pm

Hi

Take a look here:

All you need is an Assign activity:

Assign Left:
str_Result

Assign Right:
System.Text.RegularExpressions.Regex.Match(yourStr, "INSERTxREGEXxPATTERN").ToString
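
To make the Assign above more concrete, here is a minimal standalone VB.NET sketch of the same Regex.Match call. The sample text and the pattern are hypothetical, since the actual text from the PDF isn't shown in this thread; in the workflow, yourStr is assumed to be the String that holds the text read from the PDF.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ExtractSketch
    Sub Main()
        ' Hypothetical stand-in for the text read from the PDF.
        Dim yourStr As String = "Invoice Number: INV-00123" & Environment.NewLine & "Total: 45.00"

        ' Hypothetical pattern: capture the non-whitespace run that follows
        ' the consistent label "Invoice Number:".
        Dim m As Match = Regex.Match(yourStr, "Invoice Number:\s*(\S+)")

        ' Check Success so a failed match yields an empty string
        ' rather than being mistaken for a real value.
        Dim str_Result As String = If(m.Success, m.Groups(1).Value, String.Empty)

        Console.WriteLine(str_Result) ' prints INV-00123
    End Sub
End Module
```

In an Assign activity, only the expression is needed on the right side, for example Regex.Match(yourStr, "Invoice Number:\s*(\S+)").Groups(1).Value with the pattern swapped for one that fits the real sample. Using a capture group returns only the value and leaves the label out, which .ToString on the whole match would not do.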

Take a look at this sample regex pattern

You can learn more about regex here:

Hopefully this helps

Cheers

Steve

eduardo34 · March 6, 2023, 12:59am

Will do so. Thanks.

system · April 2, 2023, 9:31pm

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.
