How to extract specific text from PDF

Dear All,

I have to extract a person's details from a PDF file. This content is in the middle part of the file, and I need to extract only these specific details.
I have attached a screenshot of the PDF file.

Please advise how I can extract…

Thanks in advance,
Muthu

Hi @muthu.m

Let me know which fields you want to extract from the image above.
If you can share the PDF here, it will be easier to work out a regex for it.

Happy Automation

Best Regards
Er Pratik Wavhal

Hi @Pratik_Wavhal,
Thanks for your quick response!
I want to extract all the data (Name, Reg Number, etc.).
I'm not able to attach the PDF file.

Thanks in Advance,
Muthu

Hi @muthu.m

Actually, the problem is that the PDF image is in a structured format, so even if I run OCR on it, the text will come out on separate lines. That makes it hard to work out an exact regex to give you.
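To illustrate the point about separate lines, here is a minimal sketch in Python (used purely for illustration; the thread itself concerns UiPath). The sample OCR output and the value `John Doe` are hypothetical, as is the assumption that each value lands on the line after its label.

```python
import re

# Hypothetical OCR output: each label and each value ends up on its own line.
ocr_text = "Full Name\nJohn Doe\nReg Number\nAB12345\n"

# A pattern written for "label: value" on a single line finds nothing here.
same_line = re.search(r"Full Name\s*:\s*(.+)", ocr_text)

# A pattern with an optional colon, where \s* may also span the line break,
# still finds the value on the following line.
next_line = re.search(r"Full Name\s*:?\s*(.+)", ocr_text)

print(same_line)            # None
print(next_line.group(1))   # John Doe
```

The newline-tolerant pattern is fragile: if a value is missing, it captures the next label instead, which is why seeing the actual extracted text matters before settling on a regex.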

It would be best if you could upload the PDF.

Or you can do one thing: read your PDF with the preserve format option set to True and write the data into Notepad. Then you can share the Notepad data here.

So is the problem that you are not able to upload the PDF here on the forum, or is it that you can't share it for security reasons?

I ask because the forum does give an option to upload documents. Below is a screenshot of it:
image

Happy Automation

Best Regards
Er Pratik Wavhal

Hi @Pratik_Wavhal,

Please find the attached text file.

Thanks in Advance
Muthu

Extract_Data_From_Pdf.txt (836 Bytes)

@sandeep13 @Palaniyappan

Any ideas on the above?
Kindly advise.

Regards,
Muthu

Hi @muthu.m,
You can extract any specific field, text, or table from a PDF using UiPath Document Understanding and the Intelligent OCR activities. Use the form field extractor.
This video might help:

Hi @muthu.m

So from your image, you want to extract Full Name, Reg Number, and Mobile Number.
Which other fields do you want to extract?
Please list every field you want in the output.

Happy Automation

Best Regards
Er Pratik Wavhal

Hi @Pratik_Wavhal,

Please find the attached screenshot.

Thanks in Advance,
Muthu

Hi @muthu.m

Below is the workflow that gives the output you expected:
Main.xaml (20.1 KB)

The regex applied:

Input file:
Extract_Data_From_Pdf.txt (836 Bytes)

Output:
image
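
The regex from this reply is not reproduced in the thread text. As a rough illustration of the general idea (label-anchored patterns applied to the extracted text), here is a minimal sketch in Python. It is not the workflow's actual expression. The sample text, its `label : value` layout, and the helper `extract_fields` are all hypothetical; only the field names (Full Name, Reg Number, Mobile Number) come from the thread.

```python
import re

# Hypothetical sample text: the real Extract_Data_From_Pdf.txt is not shown
# in the thread, so this layout is an assumption.
SAMPLE_TEXT = """\
Some header text
Full Name     : Jane Doe
Reg Number    : REG-001234
Mobile Number : 9876543210
Some footer text
"""

# Field labels mentioned in the thread.
FIELDS = ["Full Name", "Reg Number", "Mobile Number"]


def extract_fields(text, labels):
    """Return {label: value} for each label found as 'label : value' on one line."""
    result = {}
    for label in labels:
        # re.escape guards against labels that contain regex metacharacters.
        pattern = re.compile(
            rf"^\s*{re.escape(label)}\s*:\s*(.+?)\s*$", re.MULTILINE
        )
        match = pattern.search(text)
        if match:
            result[label] = match.group(1)
    return result


if __name__ == "__main__":
    for name, value in extract_fields(SAMPLE_TEXT, FIELDS).items():
        print(f"{name}: {value}")
```

In UiPath, a pattern of the same shape could presumably be used in a Matches activity or with `System.Text.RegularExpressions.Regex` with the multiline option enabled, though the layout of the real file would dictate the exact expression.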

Mark as solution and like it.

Happy Automation

Best Regards
Er Pratik Wavhal
