How to read specific data from a PDF

Hi,
Use Document Understanding, or did you try the Read PDF with OCR activity followed by a regex to extract the data?

@Harish_pavuluri You can use regex, provided the text is digitized perfectly.

Thanks for the reply.

I used Read PDF, but it returns the full text, and I don't know how to use regex. Can you explain it?

Thanks for the reply.

I am new to UiPath and I don't know how regex would help.

Hi,
Check this video: Regular Expressions (Regex) Tutorial: How to Match Any Pattern of Text - YouTube

Hey @Harish_pavuluri

Check out my Regex Megapost designed to help new users :slight_smile:

Save the output of the Read PDF Text activity in a variable and use a Log Message activity to print that variable (i.e., the PDF text) to the Output panel. Copy the text from there and share it so that an appropriate regex can be built to capture the required information.

Thanks for the reply.
The file is attached. The data to extract is marked as xxxxxxxx, and the Excel sheet where I have to save the data is also attached.
New Text Document.txt (1.8 KB)
Capture

Use the Matches activity: give the PDF text as its input, apply the regex, and store the result in a collection variable. Please see the attached image for extracting the Customer Reference from the PDF text. The collection variable will contain every match of Customer Reference in the whole PDF text.

To print all the Customer Reference values captured by the regex:

The properties panel of all activities used:

image

image
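
Since the screenshots are not reproduced here, the same idea can be sketched outside UiPath. This is only an illustration in Python, not the workflow itself: the sample text is hypothetical (the real attachment is not shown), and the pattern is the Customer Reference lookbehind used later in this thread.

```python
import re

# Hypothetical sample standing in for the Read PDF Text output;
# the real attachment from this thread is not reproduced here.
pdf_text = (
    "Customer Reference ABC123 Entry Date 01/02/2021 1,250.00\n"
    "Customer Reference XYZ789 Entry Date 15/03/2021 980.50\n"
)

# Lookbehind: match the word that follows the label, without the label itself.
pattern = re.compile(r"(?<=Customer Reference )\w+")

# Like the Matches activity, collect every match found in the whole text.
customer_refs = [m.group(0) for m in pattern.finditer(pdf_text)]

# Equivalent of looping over the collection and logging each value.
for ref in customer_refs:
    print(ref)
```

In UiPath the collection holds match objects, so each item's matched text is read from its value rather than being a plain string as in this sketch.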

Hope it helps.

Thanks for the support.

It's working fine, but I am unable to save that output in Excel.

I tried to extract the amount (see the screenshot), but it does not capture any amount.
Capture1

Hi @Harish_pavuluri - Could you create the text file with “Preserve Format” set to True in the Read PDF Text activity and share it? Also, let me know which values you are looking to extract.

image

To write the output to Excel you can use the Write Range activity. For that, create a DataTable with Build Data Table, then add one row per match using Add Data Row inside a For Each loop.
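
The build-table, add-rows, write-once flow can be sketched as follows. This is a rough Python analogy rather than UiPath code: the values and column names are hypothetical, and CSV is used only as a stand-in for the Excel sheet.

```python
import csv
import io

# Hypothetical extracted values; in the workflow these come from the regex matches.
customer_refs = ["ABC123", "XYZ789"]
amounts = ["1,250.00", "980.50"]

# "Build Data Table": define the columns once.
columns = ["Customer Reference", "Amount"]

# "For Each" + "Add Data Row": one row per extracted record.
rows = []
for ref, amount in zip(customer_refs, amounts):
    rows.append({"Customer Reference": ref, "Amount": amount})

# "Write Range": write the header and all rows in one go
# (an in-memory CSV here, as a stand-in for the Excel file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns)
writer.writeheader()
writer.writerows(rows)

print(buffer.getvalue())
```

The point of the design is that the table is filled completely first and written once at the end, instead of writing cell by cell inside the loop.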

Can you point out the amount in the text file you shared? If it is missing from the text file then, as suggested by @prasath17, set ‘Preserve Format’ to True and then either share the text file or try to read the amount again.

Hi,
I have shared the amount screenshot, and the output of the Read PDF Text activity is saved in the Notepad file.
Capture
New Text Document.txt (1.8 KB)

Hi,
The Excel screenshot is above, along with the Notepad file.
I need the Entry Date, Amount, Order Party and Extra Information.

Try this regex -

((?<=Customer Reference )\w+)|((?<=Entry Date )[\d\/]{10})|((?<=[\d\/]{10} )[\d,]+\.\d{2})|((?<=Order Party )\w+)

For Extra Information:

(?<= Extra Information )\w+
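
To see what these two patterns capture, here is a small check in Python. It is only a sketch: the sample line is hypothetical and assumes each label is followed by a single space, which may not match the real PDF text. UiPath uses the .NET regex engine, but since every lookbehind here has a fixed width, the patterns should behave the same way in Python's re module.

```python
import re

# Hypothetical one-line sample; the real text file from this thread is not shown.
sample = (
    "Customer Reference ABC123 Entry Date 01/02/2021 1,250.00 "
    "Order Party ACME Extra Information Invoice"
)

# The combined pattern from the post above, unchanged.
main_pattern = re.compile(
    r"((?<=Customer Reference )\w+)|((?<=Entry Date )[\d\/]{10})"
    r"|((?<=[\d\/]{10} )[\d,]+\.\d{2})|((?<=Order Party )\w+)"
)
# The separate Extra Information pattern, unchanged (note the leading space).
extra_pattern = re.compile(r"(?<= Extra Information )\w+")

# Each alternative is its own capture group, so the group number
# tells us which field a given match belongs to.
labels = {1: "Customer Reference", 2: "Entry Date", 3: "Amount", 4: "Order Party"}

record = {}
for m in main_pattern.finditer(sample):
    record[labels[m.lastindex]] = m.group(0)

extra = extra_pattern.search(sample)
if extra:
    record["Extra Information"] = extra.group(0)

print(record)
```

Because `\w+` stops at the first space, an Order Party or Extra Information value made of several words would be captured only up to its first word.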

@Harish_pavuluri The Order Party is not given for the 3rd statement.

Thanks for the reply.
It is throwing an error.