PDF TABLE EXTRACTION date

Hi team,
I am getting this type of table from a PDF.
I have to extract the highlighted data.
Could you please suggest an approach?

Hi @Nikhil_Katta,
Use the Read PDF Text activity, and then use regex to extract the text you want.

If that does not work, use the Read PDF With OCR activity, and then use regex.
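One way to decide when to fall back to OCR is to check whether the plain text output contains the labels you expect. The sketch below is only an illustration: `NeedsOcr` is a hypothetical helper, the sample strings are made up, and in a UiPath workflow the same check would normally live in an If activity on the output of the read activity.

```vb
Imports System

Module OcrFallbackCheck
    ' Hypothetical helper: returns True when the text looks unusable,
    ' i.e. it is empty or none of the expected labels appear in it.
    Function NeedsOcr(pdfText As String, ParamArray labels As String()) As Boolean
        If String.IsNullOrWhiteSpace(pdfText) Then Return True
        For Each label As String In labels
            If pdfText.IndexOf(label, StringComparison.OrdinalIgnoreCase) >= 0 Then
                Return False
            End If
        Next
        Return True
    End Function

    Sub Main()
        ' Made-up strings standing in for the activity output.
        Dim textLayer As String = "Approved From 9/15/2021 Number of Visits 12 8"
        Dim scannedOnly As String = "   "

        Console.WriteLine(NeedsOcr(textLayer, "Approved From", "Approved Thru"))   ' False
        Console.WriteLine(NeedsOcr(scannedOnly, "Approved From", "Approved Thru")) ' True
    End Sub
End Module
```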

Hello @Nikhil_Katta, I suggest you use the Read PDF Text activity. If you cannot find it in the Activities panel, go to Manage Packages and install this:
image

Once that is done, please post the output text here so that we can send you the regex pattern to extract the highlighted data.

Hi,
I want a regex pattern for "Number of approved visits 8", and I also want the Approved From and Approved Thru dates.

@Nikhil_Katta

Approved From
image

Approved Thru
image

Extracts 8
image

XAML

Sequence1.zip (1.7 KB)

Regards

Hey!
Assuming your expected output follows this pattern: Number of Visits 8, then the Approved From and Approved Thru values,
I have attached the workflow below:
Approved.xaml (8.2 KB)

Or, to get the Approved From, Approved Thru, and Number of Visits values separately:
Approved From: (?<=Approved From)\s+\d{1,2}/\d{1,2}/\d{4}
Approved Thru: (?<=Approved Thru)\s+\d{1,2}/\d{1,2}/\d{4}
Number of Visits: (?<= Number of Visits\s+\d+\s+)\d
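To try patterns like these outside Studio, a small console sketch can help. The sample text is hypothetical (the real output may be laid out differently), the date pattern here assumes the month and day can each have one or two digits, and the visits pattern uses `\d+` so that a multi-digit count would also be captured.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ApprovedFieldsDemo
    Sub Main()
        ' Hypothetical text; not taken from the actual PDF.
        Dim pdfText As String =
            "Approved From 9/15/2021 Number of Visits 12 8" & Environment.NewLine &
            "Approved Thru 12/14/2021 DME N"

        ' Whitespace, then a date with 1-2 digit month and day and a 4 digit year.
        Dim datePattern As String = "\s+\d{1,2}/\d{1,2}/\d{4}"

        ' The lookbehind keeps the label out of the match; Trim drops the leading space.
        Dim approvedFrom As String =
            Regex.Match(pdfText, "(?<=Approved From)" & datePattern).Value.Trim()
        Dim approvedThru As String =
            Regex.Match(pdfText, "(?<=Approved Thru)" & datePattern).Value.Trim()

        ' .NET allows variable-length lookbehind: skip the first number, take the second.
        Dim visits As String =
            Regex.Match(pdfText, "(?<=Number of Visits\s+\d+\s+)\d+").Value

        Console.WriteLine(approvedFrom) ' 9/15/2021
        Console.WriteLine(approvedThru) ' 12/14/2021
        Console.WriteLine(visits)       ' 8
    End Sub
End Module
```

In a workflow, each `Regex.Match(...)` expression would go into an Assign activity with the PDF text variable in place of `pdfText`.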

Hey @Nikhil_Katta ,

I have tried to meet your requirements. Below are a screenshot of the output and the XAML file.

New folder (2).zip (109.0 KB)
Hope it helps you out!

Hello @Nikhil_Katta
Please use the following expressions to get your output:

System.Text.RegularExpressions.Regex.Match(str_Pdftext,"(?<=Approved From).*(?=Number of Visits)").Value.Trim
System.Text.RegularExpressions.Regex.Match(str_Pdftext,"(?<=Number of Visits).*").Value.Trim.Split(" "c)(1)
System.Text.RegularExpressions.Regex.Match(str_Pdftext,"(?<=Approved Thru).*(?=DME)").Value.Trim
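A rough sketch of how these lookaround expressions behave, run against made-up text. The sample string and the `MatchOrEmpty` helper are assumptions for illustration only. It also guards the `Split(...)(1)` step, which throws an exception if fewer than two tokens follow the label.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LookaroundDemo
    ' Hypothetical helper: trimmed match value, or an empty string when nothing matches.
    Function MatchOrEmpty(source As String, pattern As String) As String
        Dim m As Match = Regex.Match(source, pattern)
        Return If(m.Success, m.Value.Trim(), String.Empty)
    End Function

    Sub Main()
        ' Made-up stand-in for str_Pdftext.
        Dim str_Pdftext As String =
            "Approved From 9/15/2021 Number of Visits 12 8" & Environment.NewLine &
            "Approved Thru 12/14/2021 DME N"

        ' Text between the two labels on the same line.
        Dim approvedFrom As String =
            MatchOrEmpty(str_Pdftext, "(?<=Approved From).*(?=Number of Visits)")
        Dim approvedThru As String =
            MatchOrEmpty(str_Pdftext, "(?<=Approved Thru).*(?=DME)")

        ' Rest of the line after the label, split on spaces; index 1 is the second token.
        Dim tokens As String() =
            MatchOrEmpty(str_Pdftext, "(?<=Number of Visits).*").
                Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
        Dim visits As String = If(tokens.Length > 1, tokens(1), String.Empty)

        Console.WriteLine(approvedFrom) ' 9/15/2021
        Console.WriteLine(approvedThru) ' 12/14/2021
        Console.WriteLine(visits)       ' 8
    End Sub
End Module
```

Because `.` does not match a line feed by default, each `.*` stops at the end of the line, so these expressions only work when the label and its value are on the same line of the extracted text.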

Happy Automation
