Need to extract only the first line of information for a specific field

Description

Hi there,
I am extracting information from a PDF. After reading the PDF, I store the text in a string and fetch the values from it.
There is a field called “Aggrement:” with an associated value. I am able to read that value, but along with it I am also getting the next field name (Description) in the output, which I don’t need.
I only want the value associated with the field “Aggrement”. How can I stop the next field name from being included?

Aggrement : ABC123 XYZ
Description : Welcome UiPath

Code I have used: strPDFData.Split({"Aggrement:"}, StringSplitOptions.None)(1).Trim().Split(" "c)(0)

Link

https://cloud.uipath.com

Date

2025-02-20

Related UiPath products

Studio

@avinashy

If you want to use Split, then try this:

strPDFData.Split({"Aggrement:"}, StringSplitOptions.None)(1).Trim().Split({"Description:"}, StringSplitOptions.None)(0).Trim()
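The same two-step split can be sketched outside Studio to see what each step keeps. This is a minimal illustration in Python rather than the VB.NET expression above; the helper name `value_between` and the sample text are assumptions taken from the example in the question, not from the real PDF.

```python
def value_between(text: str, label: str, next_label: str) -> str:
    """Return the trimmed text found between label and next_label."""
    # Keep everything after the first occurrence of the label.
    after_label = text.split(label, 1)[1]
    # Cut the remainder at the next field name and trim whitespace.
    return after_label.split(next_label, 1)[0].strip()


# Sample shaped like the two lines posted in the question.
sample = "Aggrement : ABC123 XYZ\nDescription : Welcome UiPath"
print(value_between(sample, "Aggrement :", "Description"))
```

Note that the label has to match the extracted text exactly, including any space before the colon; if the label is missing, the index lookup fails, just as `(1)` does in the VB.NET version.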

Using regex:

System.Text.RegularExpressions.Regex.Match(stringhere,"(?<=Agreement :).*",RegexOptions.Multiline).Value.trim

cheers

Hi @avinashy

Use a Regex to extract only the text between “Aggrement” and “Description”:

result = System.Text.RegularExpressions.Regex.Match(inputString, "(?<=Aggrement : )(.*?)(?=\s*Description)").Value.Trim()

It will also work when they appear on a single line.
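A rough way to check both layouts is to run the same idea against a two-line and a one-line sample. The sketch below is in Python, not VB.NET, and uses a capture group with `\s*` around the colon instead of a lookbehind so that spacing differences do not matter; the function name `extract_agreement` and both samples are assumptions for illustration.

```python
import re

# Capture lazily up to the next field name; \s* tolerates spaces or a
# line break on either side of the colon and before "Description".
PATTERN = re.compile(r"Aggrement\s*:\s*(.*?)\s*Description", re.DOTALL)


def extract_agreement(text: str) -> str:
    """Return the value after 'Aggrement', or an empty string if absent."""
    match = PATTERN.search(text)
    return match.group(1) if match else ""


two_lines = "Aggrement : ABC123 XYZ\nDescription : Welcome UiPath"
one_line = "Aggrement : ABC123 XYZ Description : Welcome UiPath"
print(extract_agreement(two_lines))
print(extract_agreement(one_line))
```

An empty result usually means the label in the pattern does not match the extracted text character for character, which is worth ruling out first.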

I hope this helps.
Happy Automation!


@Anil_G Thanks for the response. The Regex code is giving a blank value. However, the Split worked.
One thing I noticed here, please refer to the screenshot.
In the actual PDF, the field Ext.Agr No: contains 1/1-1/31/25, but after reading the PDF, the string output shows the values of the fields above it (Country and Currency: US,USD) under Ext.Agr No:.

So here I want to fetch only the value that belongs to the field Ext.Agr No:

Can you help here?

@V_Roboto_V Thanks for the quick response. The regex is giving a blank value.

Hi @avinashy

Is the spelling of “Aggrement” correct?

@Anil_G @V_Roboto_V This is how the PDF looks. I have masked the sensitive values.

I am trying to extract the value for the field Ext.Agr No:

I want to get only that single-line value, without the Description below it or anything else.


I gave it as Aggrement, but you can use the correct field name: Ext.Agr No:


Hi @avinashy

Is the format of the Ext.Agr No fixed or does it change?

You could use this Regex just to extract that number:

\d{1,2}\/\d{1,2}-\d{1,2}\/\d{1,2}\/\d{2,4}

Or, use this Regex but extract the Group 2:

(?<=Ext.Agr No:\n)(.+\n.+\n)(.*)(?=\s*Description)
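To compare the two options, here is a small Python sketch (not the VB.NET used in Studio). The sample string is hypothetical: it assumes two unrelated lines sit between the label and the wanted value, which is what Group 1 skips. The names `by_format` and `by_position` are made up for this illustration.

```python
import re

# Hypothetical extracted text; the real layout must be checked in the
# Locals panel, since the PDF itself does not show it.
sample = "Ext.Agr No:\nUS\nUSD\n1/1-1/31/25\nDescription : Welcome UiPath"

# Option 1: match the value by its shape (m/d-m/d/yy).
DATE_RANGE = re.compile(r"\d{1,2}/\d{1,2}-\d{1,2}/\d{1,2}/\d{2,4}")

# Option 2: skip two lines after the label and take the third (Group 2).
THIRD_LINE = re.compile(r"(?<=Ext\.Agr No:\n)(.+\n.+\n)(.*)(?=\s*Description)")


def by_format(text: str) -> str:
    match = DATE_RANGE.search(text)
    return match.group(0) if match else ""


def by_position(text: str) -> str:
    match = THIRD_LINE.search(text)
    return match.group(2).strip() if match else ""


print(by_format(sample))
print(by_position(sample))
```

The format-based pattern does not care where the value lands in the string, so it is the safer choice if the value always looks like a date range; the position-based pattern breaks as soon as the number of lines in between changes.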

@avinashy

First thing: change the value in the regex so that its spaces and special characters match what you actually get. Use the Locals panel to check how the value looks.

The PDF itself will not help. Check the data in the Locals panel while in debug mode, or read the PDF, write the text to Notepad, and check how the data looks. That would help.
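As a rough illustration of why inspecting the raw string matters, the Python sketch below prints a string with its control characters made visible, similar in spirit to what the Locals panel shows. The helper name `show_hidden` and the sample value are assumptions for illustration only.

```python
def show_hidden(text: str) -> str:
    """Return text with control characters shown as escape sequences."""
    return repr(text)


# Hypothetical extracted text with Windows-style line breaks.
extracted = "Ext.Agr No:\r\nUS,USD\r\n1/1-1/31/25"
print(show_hidden(extracted))
```

Seeing `\r\n` rather than `\n` in the output is the kind of detail that decides whether a pattern written with `\n` can match.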

cheers

Hi @avinashy

Can you try this Regex code in an Assign activity:

System.Text.RegularExpressions.Regex.Match(originalString, "(?<=Ext.Agr No:\n)(.+\n.+\n)(.*)(?=\s*Description)").Groups(2).Value.Trim()

Here is the Output:


This is how the data is appearing in locals panel

@V_Roboto_V I am getting a blank value as output.
I have attached how it appears in the Locals panel.

@avinashy

I don't see “Agreement” in this at all.

But try this:

System.Text.RegularExpressions.Regex.Match(stringhere,"(?<=Agreement\s*:).*",RegexOptions.Multiline).Value.trim

If the value is on a different line, then first split the data on the keyword, then split on newline, and get the value based on its index.
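That keyword-then-newline approach could look roughly like the following. It is a Python sketch, not a UiPath expression; the helper name `line_after_label`, the sample text, and the index 2 are all assumptions that depend on how the extracted string is actually laid out.

```python
def line_after_label(text: str, label: str, index: int = 0) -> str:
    """Return the index-th non-empty line that follows label."""
    # Keep everything after the keyword.
    after_label = text.split(label, 1)[1]
    # splitlines() handles \n, \r\n and \r; blank lines are dropped.
    lines = [ln.strip() for ln in after_label.splitlines() if ln.strip()]
    return lines[index]


# Hypothetical layout: two unrelated lines precede the wanted value.
sample = "Ext.Agr No:\r\nUS\r\nUSD\r\n1/1-1/31/25\r\n\r\nDescription : Welcome UiPath"
print(line_after_label(sample, "Ext.Agr No:", 2))
```

The index is fragile: if the PDF reader emits the lines in a different order for another document, the wrong line is returned without any error.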

Cheers

Hi @avinashy

Did you try the Regex code in the immediate panel and check?

Can you paste a snippet of that string into the regex101 website to see the format of the string (line breaks, spaces, etc.), then paste the regex there and check?

If “\n” is not working use “\s” in the regex:

(?<=Ext.Agr No:\n)(.+\s.+\s)(.*)(?=\s*Description)

I have tried to replicate the same. it seems to work fine.

@V_Roboto_V I have pasted the string that I am getting in the Locals panel and added the regex.

Not sure where it's going wrong. I am getting a blank value.

Thanks for this screenshot. I’ve got an idea.

Hi @avinashy

The problem was with the “\r\n”. We had to escape the “\” character.

Here is the updated regex:

(?<=Ext.Agr No:\\r\\n\n)(.+\\r\\n\n.+\\r\\n\n)(.*)(?=\\r\\n\\r\\n\nDescription)

Extract it from Group 2
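To see why the escaping matters, the Python sketch below contrasts two hypothetical strings: one that contains the literal characters backslash-r-backslash-n followed by a real newline (the shape the updated regex expects), and one that contains real carriage-return/line-feed pairs. Both samples and the helper names are assumptions for illustration; which shape applies depends on what the workflow variable really holds, as opposed to what a copied Locals-panel value looks like.

```python
import re

# Case 1: literal "\r\n" text followed by a real newline.
literal = ("Ext.Agr No:\\r\\n\nUS\\r\\n\nUSD\\r\\n\n"
           "1/1-1/31/25\\r\\n\\r\\n\nDescription : Welcome UiPath")
LITERAL_RE = re.compile(
    r"(?<=Ext.Agr No:\\r\\n\n)(.+\\r\\n\n.+\\r\\n\n)(.*)(?=\\r\\n\\r\\n\nDescription)"
)

# Case 2: real CR/LF line breaks; "." may swallow the "\r", so trim it.
real = "Ext.Agr No:\r\nUS\r\nUSD\r\n1/1-1/31/25\r\n\r\nDescription : Welcome UiPath"
REAL_RE = re.compile(r"Ext\.Agr No:\r?\n.+\n.+\n(.*)")


def from_literal(text: str) -> str:
    match = LITERAL_RE.search(text)
    return match.group(2) if match else ""


def from_real(text: str) -> str:
    match = REAL_RE.search(text)
    return match.group(1).strip() if match else ""


print(from_literal(literal))
print(from_real(real))
```

Each pattern returns an empty string on the other sample, which is one plausible way a regex can work on a test site and still come back blank inside the workflow.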