Extracting data from PDFs

Hi,
I’m trying to extract data from PDFs.
I thought I could convert the PDFs to text and use regex.
I was wrong. Even for the same kind of PDF, the text comes out in a different order, so of course the regex doesn’t work.
Additionally, I’ll have to extract whether radio buttons are selected, and which ones.
What would be your choice for extracting the data?
P.S. The data is fake :slight_smile:


Hey @OzogRPA

Are you using Preserve Format in the Read PDF activity?

Thanks
#nK

Hi,
No, I’ll try with Preserve Format and let you know.
Any ideas for a more “advanced” way of reading data from PDFs?

Looks like Preserve Format isn’t helpful in this case :frowning:

The advanced method would be Document Understanding!

Can you recommend an up-to-date tutorial?

Hello @OzogRPA

Judging by the screenshot you shared, the PDFs all have the same format, so can you try the Get Text activity?

Open the PDF in a PDF reader, then use Get Text to get the details.

The advanced method is Document Understanding, which can also extract data from unstructured PDFs. You can take the course on academy.uipath.com for a better understanding.
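
If you stay with plain text and regex in the meantime, one way to cope with the changing order is to search for each field by its own label instead of writing one pattern for the whole page. Below is a rough Python sketch of that idea, not a UiPath workflow. The labels (`Name:`, `Invoice No:`, `Date:`) and the sample text are made up, since I can't see your real fields, and it assumes each value stays on the same line as its label. The same idea should carry over to one regex match per field inside UiPath.

```python
import re

# Hypothetical labels: replace these with the labels printed on your PDF.
FIELD_PATTERNS = {
    "name": re.compile(r"Name:\s*(.+)"),
    "invoice_no": re.compile(r"Invoice No:\s*(\S+)"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
}


def extract_fields(text):
    """Look up each label independently, so the line order does not matter."""
    result = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        # A missing field becomes None instead of raising an error.
        result[field] = match.group(1).strip() if match else None
    return result


# Two made-up text dumps of the same form, with the lines in a different order.
sample_a = "Name: Jane Doe\nInvoice No: INV-001\nDate: 2021-03-04\n"
sample_b = "Date: 2021-03-04\nName: Jane Doe\nInvoice No: INV-001\n"

print(extract_fields(sample_a))
print(extract_fields(sample_b))
```

Both samples should give the same dictionary. This will not help if the extracted text separates a label from its value, and it does nothing for the radio buttons, so Document Understanding is still the better route for those.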

Thanks