Data extraction from PDF file

Hridhya · January 10, 2022, 1:40pm

How can I extract specific data from a PDF file that is not in tabular form, just plain text?

Nithinkrishna · January 10, 2022, 2:07pm

Hey @Hridhya

Mostly it’s based on labels or patterns…
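As a rough illustration of the label-based idea: once the PDF text has been read into a string, a value can be picked up by looking for the label that sits in front of it. Below is a minimal Python sketch, assuming the text is already extracted. The sample text, the labels and the `value_after_label` helper are all made up for illustration and would need adjusting to the real document.

```python
import re

# Hypothetical text, standing in for whatever a PDF text-extraction step returns.
SAMPLE_TEXT = """Invoice Number: INV-1001
Invoice Date : 10/01/2022
Total Amount: 1,250.00
"""

def value_after_label(text, label):
    """Return the text that follows `label` on the same line, or None if absent."""
    # re.escape stops characters in the label from being read as regex syntax.
    # The optional colon and surrounding spaces/tabs are skipped before the value.
    pattern = re.escape(label) + r"[ \t]*:?[ \t]*(.+)"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None

print(value_after_label(SAMPLE_TEXT, "Invoice Number"))
print(value_after_label(SAMPLE_TEXT, "Total Amount"))
```

This only works when the value is on the same line as its label. If the value sits on the next line or the label wording varies, a field-specific pattern is usually the better fit.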

Thanks
#nK

Charbel1 · January 10, 2022, 2:09pm

Hey,

Can you share a sample?

Nithinkrishna · January 10, 2022, 2:10pm

Hey @Charbel1

I don’t have a sample handy, but if you have any PDF samples, maybe I can help you with a small POC.

Thanks
#nK

Angel_Llull · January 10, 2022, 3:30pm

Hello @Hridhya,

One of the most common tools is “Regex”.

Also: Document Understanding - AI Document Processing | UiPath

Hope it helps!

Vinit_Mhatre · January 10, 2022, 5:01pm

Hey @Charbel1 @Hridhya

Try this example …
Put the PDF file into the “PDF PATH” folder to try it.
In it, I used the OCR method to extract all the plain text from the PDF, then used regex to get the specific data out of the extracted text.
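For readers who cannot open the attachments, the regex step of an OCR-then-regex approach might look roughly like the Python sketch below. It does not reproduce Main.xaml. The OCR step itself is left out, and the sample text, field names and patterns are assumptions about what an invoice could look like.

```python
import re

# Hypothetical OCR output; real OCR text often has uneven spacing like this.
OCR_TEXT = (
    "INVOICE\n"
    "Invoice  No :  INV-2047\n"
    "Date: 10-01-2022\n"
    "Grand Total   $ 3,480.50\n"
)

# One pattern per field. These are assumptions about the layout, not a fixed rule set.
FIELD_PATTERNS = {
    "invoice_no": r"Invoice No\s*:?\s*([A-Z]+-\d+)",
    "date": r"Date\s*:?\s*(\d{2}[-/]\d{2}[-/]\d{4})",
    "total": r"Grand Total\s*\$?\s*([\d,]+\.\d{2})",
}

def extract_fields(ocr_text):
    """Return a dict of field name -> captured value (None when not found)."""
    # Collapse runs of spaces/tabs so the patterns do not have to model OCR spacing noise.
    cleaned = re.sub(r"[ \t]+", " ", ocr_text)
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, cleaned, flags=re.IGNORECASE)
        result[name] = match.group(1) if match else None
    return result

print(extract_fields(OCR_TEXT))
```

Keeping one pattern per field in a dictionary makes it easy to add or tune a field without touching the rest, which helps when OCR quality differs between documents.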

Main.xaml (24.8 KB)
invo1.pdf (93.3 KB)

Thanks,
Vinit
