Extract the correct value from a PDF

Hi guys,
I need to extract the correct value from a PDF.

Output of “Read PDF Text”:
Page 1 of 1
Job No 85542
8082379
Invoice Address
RENAULT TRUCKS COMMERCIAL
Wedgnock Lane
WARWICK
Warwickshire
CV34 5YA
REN001
AC21126
24/02/2024
Invoice No
Invoice Date
Customer Order
Account

Question: As you can see, the label ‘Account’ and its value ‘REN001’ are on separate lines.
How can I fetch the Account value?

Hi @nidhi.kowalli1

Try setting Preserve Formatting to True in the properties of Read PDF Text and run the workflow. Share the extracted text here and I will help you with regular expressions.

If that doesn’t work, share the PDF and I will help you.

Regards

The code is running on an older version. I don’t have that option in the properties.


20636Invoice.pdf (71.8 KB)

Hi @nidhi.kowalli1

Please check the process below, built with modern activities. I have attached the XAML for your reference.


Main.xaml (17.2 KB)

Regards

Hi,

Which version of UiPath.PDF.Activities package do you use?
In my environment, the following works.

System.Text.RegularExpressions.Regex.Match(strPdf,"(?<=Account\s*).*").Value
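
For anyone who wants to try the idea outside UiPath, here is a minimal Python sketch. It assumes that, with Preserve Formatting enabled, the label and its value land on the same line; the sample text and the `extract_account` helper are hypothetical and not taken from the attached PDF. Python's `re` module only accepts fixed-width lookbehind, so the sketch uses a capture group in place of `(?<=Account\s*)`.

```python
import re

# Hypothetical excerpt of preserved-format output; the real spacing may differ.
pdf_text = "Job No 85542\nAccount        REN001\nInvoice Address\n"


def extract_account(text: str) -> str:
    """Return the text that follows 'Account' on the same line, or ''."""
    # [ \t]* skips only spaces and tabs, so the match never crosses a line break.
    match = re.search(r"Account[ \t]*(.*)", text)
    return match.group(1).strip() if match else ""


print(extract_account(pdf_text))
```

Because only spaces and tabs are skipped after the label, the function returns an empty string when the value sits on a different line, as in the original unformatted output.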

Sample
Sample20240321-6L.zip (62.7 KB)

Regards,

Hi @nidhi.kowalli1 ,

Can you please share your package versions so that we can replicate the issue and check what is going wrong?

UiPath version : 2022.4.1
UiPath.PDF.Activities: 3.16.0

Hi @nidhi.kowalli1

Please update UiPath.PDF.Activities to the latest version, keep the runtime rule as Lowest Applicable Version, and then create the flow as shown above. It works.

Regards

Instead, can we add an expression in the “Range” property?
It is currently set to “All”, which reads the whole document. Can we add an expression there to preserve the formatting?

Hi @nidhi.kowalli1

I have also used the same version you mentioned for UiPath.PDF.Activities, 3.16.0.

I haven’t set the PreserveFormatting option to True, but I’m still able to extract the data as per your requirement. Please check the XAML attached below.
Main.xaml (17.2 KB)

Regards

Hi @nidhi.kowalli1 ,

I have downgraded the PDF package and replicated your use case.
Kindly check the workflow below:
Extract_Value_Using_regex.zip (63.2 KB)

I have used the expression below in an Assign activity:

System.Text.RegularExpressions.Regex.Match(text,"(?<=Account (?!Name|No)).*").Value.Trim
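
The negative lookahead `(?!Name|No)` skips labels such as "Account Name" or "Account No" and keeps only the plain "Account" line. A rough Python sketch of the same logic follows; the sample text, its extra label lines, and the `extract_account_value` helper are assumptions for illustration, and a capture group stands in for the lookbehind.

```python
import re

# "Account " must not be followed by "Name" or "No"; the rest of the line is captured.
ACCOUNT_PATTERN = re.compile(r"Account (?!Name|No)(.*)")

# Hypothetical preserved-format text; these label lines are illustrative only.
sample_text = (
    "Account Name    EXAMPLE CUSTOMER\n"
    "Account No      000000\n"
    "Account         REN001\n"
)


def extract_account_value(text: str) -> str:
    """Return the trimmed value after the plain 'Account' label, or ''."""
    match = ACCOUNT_PATTERN.search(text)
    return match.group(1).strip() if match else ""


print(extract_account_value(sample_text))
```

The final `strip()` plays the role of `.Trim` in the Assign expression, removing the padding that preserved formatting leaves between the label and the value.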

Hope it helps you out!

Thank you. It’s working fine after updating the PDF package version and setting the “Preserve Formatting” property to True.

Hi @nidhi.kowalli1

You’re Welcome.

Happy Automation!!
