Rename PDF after searching for the name inside the PDF file

Hi, I would like help renaming PDF files according to rules I set, based on specific words that appear in the same place in every PDF. To clarify, I have two examples below:

  1. For example, the PDF shows a name on the 2nd line, after the word “Name”. Is there a formula that extracts the words after “Name” and renames the file with that result?
    Date: 24/03/2022
    Name: Amenoufy
    Subject: Thank you
    In this case the file will be renamed “Amenoufy”

  2. Can I set a rule that says: if “Subject” is followed by “Thank you”, name the file with the words after
    “Name” and append “Thank you”? In this example it will rename the file: AmenoufyThankyou

Thanks in advance!

Hi @amenoufy ,

I hope you are already using the PDF activities to read the PDF file as a String. You can then use a regular expression to get the value of Name, like below:

System.Text.RegularExpressions.Regex.Match(pdfTextStr,"(Name:).*",RegexOptions.IgnoreCase).Value.ToString.Trim

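Here, pdfTextStr is the String variable that holds the output of the Read PDF Text activity. Note that .Value returns the whole match, including the "Name:" label, so the label still has to be removed (or the pattern changed to the lookbehind "(?<=Name:).*") to get only the name.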
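
To illustrate what such a pattern captures, here is a minimal sketch in Python rather than UiPath (the lookbehind syntax is the same in .NET regex). The sample text is the example from the question, and extract_name is a hypothetical helper, not a UiPath activity. It assumes the label is always written "Name:" and that the name sits on the same line.

```python
import re

# Sample text as it might come back from a PDF text extraction step
# (values taken from the example in the question).
pdf_text = "Date: 24/03/2022\r\nName: Amenoufy\r\nSubject: Thank you\r\n"


def extract_name(text):
    """Return the text that follows 'Name:' on its line, or None if absent."""
    # (?<=Name:) is a lookbehind: the label must precede the match but is
    # not part of it. '.' does not match '\n', so only one line is taken.
    match = re.search(r"(?<=Name:).*", text, re.IGNORECASE)
    # strip() removes the leading space and any trailing '\r'.
    return match.group(0).strip() if match else None


print(extract_name(pdf_text))  # prints: Amenoufy
```

Returning None when the label is missing makes it easy to skip files that do not follow the expected layout.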

For the condition you mentioned, we can also use Regex to check whether both words are present in the output String, using the expression below:

System.Text.RegularExpressions.Regex.IsMatch(pdfTextStr,"Subject:.*Thank you",RegexOptions.IgnoreCase)

The above expression returns True if "Subject: Thank you" exists in the PDF.

You can then rename the file using the Name value extracted by the first expression, appending "Thank you" to it.
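
Put together, the extract, check and rename steps could look roughly like the Python sketch below. It is an illustration of the logic only, not the UiPath workflow: build_new_name and rename_pdf are hypothetical helpers, the PDF text is passed in as a string instead of being read from the file, and the demo runs on a throwaway empty file named scan001.pdf.

```python
import re
import tempfile
from pathlib import Path


def build_new_name(text):
    """Build the new file name (without extension) from extracted PDF text."""
    name_match = re.search(r"(?<=Name:).*", text, re.IGNORECASE)
    if name_match is None:
        return None
    new_name = name_match.group(0).strip()
    # Same check as the IsMatch expression: 'Subject:' followed by
    # 'Thank you' on the same line.
    if re.search(r"Subject:.*Thank you", text, re.IGNORECASE):
        new_name += "Thankyou"
    return new_name


def rename_pdf(pdf_path, text):
    """Rename pdf_path in place, keeping its folder and extension."""
    new_name = build_new_name(text)
    if new_name is None:
        return pdf_path  # no name found: leave the file untouched
    target = pdf_path.with_name(new_name + pdf_path.suffix)
    pdf_path.rename(target)
    return target


# Demo on a temporary, empty placeholder file.
with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "scan001.pdf"
    sample.write_bytes(b"")
    text = "Date: 24/03/2022\nName: Amenoufy\nSubject: Thank you\n"
    print(rename_pdf(sample, text).name)  # prints: AmenoufyThankyou.pdf
```

Keeping the name-building logic separate from the rename makes it possible to check the result before any file is touched.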

Thank you for this, supermanPunch. I am still quite lost, as this is my first ever automation. I downloaded the PDF reader and now I have my first step:
1 - Read PDF Text, which I connected to the PDF sample I have.

Could you please tell me where I can enter that Regex formula?

@amenoufy , You could assign the First Expression to a String variable using an Assign Activity.

The Second Expression you could directly use in the If Activity.

It says Regex is not declared and pdfTextStr is not declared either. How can I fix that? Thanks in advance.

@amenoufy , For Regex import System.Text.RegularExpressions.

As mentioned in my earlier post, pdfTextStr is a String variable assigned to the Output property of the Read PDF Text activity.

Sorry for bothering you, and thank you for being such a great help. I cannot seem to find the import button. Where can I locate it?

@amenoufy You are using UiPath StudioX, and the screenshot shared by @supermanPunch is from UiPath Studio. That's why you are not able to see the imports.

If you switch to Studio you will get it: Home->License->profile->Studio

@amenoufy ,

We need to know whether you are using Studio or StudioX to build your process.

The tags in your post mention Studio.

Apologies for the wrong tag. I have successfully changed to Studio

  1. What should I put in the “To” field for the VB expression provided:
    System.Text.RegularExpressions.Regex.Match(pdfTextStr,"(Name:).*",RegexOptions.IgnoreCase).Value.ToString.Trim

  2. How can I add pdfTextStr as an output String variable, and where should I add it?

  3. How can I rename the file with the output provided?

I would very much appreciate it if you could walk me through the solution to get the output, perhaps by creating the actual workflow so I can see it working.

Thanks in advance

Hi @amenoufy ,

Check the Below Workflow :
PDF_Rename_WithDataValue.zip (122.1 KB)

We assume here that "Thank you" is a constant keyword which we already know beforehand.

First, check the files present in the Input folder, then run the workflow and check the Input folder again; the PDF files should have been renamed.

Let us know if this is not what was expected.

Hi SupermanPunch,

Firstly I am very grateful for all your help so far. Unfortunately I encountered this when I opened the file:
Do I need to import something else?

Thanks in advance!

@amenoufy , You need to install the UiPath.PDF.Activities package from Manage Packages.

THANK YOU!!! It worked.

Thank you for your patience and all your help!


SupermanPunch I need your help again please.

I am now trying to extract a series of numbers from a PDF text that are in this format:

20220101_1234
They will always be 8 digits + “_” + 4 digits (the value is different in each PDF)
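
A pattern for that format can be sketched as follows, in Python for illustration (the same pattern syntax is valid in .NET regex). The surrounding sample sentence is made up, and find_reference is a hypothetical helper. The lookarounds are an added assumption: they reject matches that are part of a longer run of digits.

```python
import re

# 8 digits, an underscore, then 4 digits, not embedded in a longer digit run.
ID_PATTERN = re.compile(r"(?<!\d)\d{8}_\d{4}(?!\d)")


def find_reference(text):
    """Return the first 8digits_4digits value in text, or None if absent."""
    match = ID_PATTERN.search(text)
    return match.group(0) if match else None


# Hypothetical sample text; only the reference format comes from the post.
print(find_reference("Invoice ref 20220101_1234 issued"))  # prints: 20220101_1234
```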

After I retrieve that number, I will need to look it up in an Excel file, where it will be under “ColumnA”. From there I need to get the data that is in “ColumnB” and rename the file with that.

For example:
Step 1: Retrieve the 8digits_4digits from PDF - In this example: 20220101_1234 from the PDF

Step 2: Search for this number in the Excel file under the column named “ColumnA” (the Excel file is always the same) to retrieve the value from “ColumnB”: In this example I find 0002 under “ColumnB”

Step 3: Rename the file with the value retrieved from “ColumnB”: In this example the file will be renamed 0002

Is that possible with UiPath Studio?
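
The lookup in Step 2 can be sketched like this, again in Python and outside UiPath. To keep the sketch free of third-party packages, the Excel sheet is replaced by a small CSV string with the same two column names; the first data row is invented, and load_lookup and new_name_for are hypothetical helpers. Values are kept as text so that leading zeros such as 0002 survive.

```python
import csv
import io

# Stand-in for the Excel sheet, written as CSV so no extra package is needed.
# Only the 20220101_1234 -> 0002 pair comes from the example; the rest is made up.
SHEET = """ColumnA,ColumnB
20211231_0042,0001
20220101_1234,0002
"""


def load_lookup(csv_text):
    """Map each ColumnA value to its ColumnB value, keeping both as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["ColumnA"]: row["ColumnB"] for row in reader}


def new_name_for(reference, lookup):
    """Return the ColumnB value for reference, or None if it is not listed."""
    return lookup.get(reference)


lookup = load_lookup(SHEET)
print(new_name_for("20220101_1234", lookup))  # prints: 0002
```

Loading the sheet once into a dictionary means each PDF only costs a single lookup, which matters if many files are renamed in one run.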