How to extract data from a form in PDF?

Gmar · June 3, 2020, 10:57am

Hi everyone,

I am new to UiPath and I am trying to extract data from a PDF that contains a form, and then transform this data into a Dictionary of strings.

The problem I am facing: I tried to use the Read PDF Text activity and then capture the data I want to keep with Regex, but I can’t make it work at all. There is also some kind of invisible data in the form’s input fields that I can’t delete or control.

This is the PDF I have to read (since I am a new member I can’t upload anything)

Could you please help me?? Thank you very much!

William_Blech_Sister · June 3, 2020, 11:39am

Hi,

You could use the ‘Read PDF Text’ activity and then paste its output into a text file. As the document is structured as a form, it may take some time, but I believe it can be done.

Then, in the text file, you could use Regex to find the text you want.
I converted the PDF to text myself and could not find any ‘invisible data’ like you mentioned.
Could you please explain it better?
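
To illustrate the Regex step, here is a minimal sketch in Python (not UiPath; the same pattern idea should carry over to a Matches or Assign activity). It assumes the extracted text puts each value on the same line as its label, in the form "label: value". That layout, the sample values, and the names `sample_text`, `LABELS` and `extract_fields` are all made up for the example, so the pattern will need adjusting to what your text file really contains.

```python
import re

# Hypothetical text standing in for the output of 'Read PDF Text'.
# The real file will almost certainly be laid out differently.
sample_text = """NIF: 12345678Z
Apellidos o Razón social: García López
Nombre: Ana
"""

# Labels to search for (assumed for illustration).
LABELS = ["NIF", "Apellidos o Razón social", "Nombre"]


def extract_fields(text, labels):
    """Return a dict of strings mapping each label found to its value."""
    fields = {}
    for label in labels:
        # Match "<label>: <value>" from the start of a line to its end.
        pattern = r"^" + re.escape(label) + r"\s*:\s*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            fields[label] = match.group(1).strip()
    return fields


fields = extract_fields(sample_text, LABELS)
```

Labels that are not found are simply left out of the dictionary, so you can check which fields failed to match.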

Thanks

Gmar · June 3, 2020, 12:05pm

Hi, thank you very much for your reply!

Well, I already did what you explained. In the sequence, I read the pdf as it is, I don’t modify anything, I don’t fill in any blanks, so I suppose it should just return the form, not any other information.

If you have an output txt, you can see there’s a part where it says “NIF Apellidos o Razón social Nombre” and then, just below, some numbers and some names. That information is hidden in the PDF and I don’t know how to handle it. When I write in the form, it gets mixed with what I write.
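
To show what I mean, this is a rough sketch (in Python, just to describe the idea) of the clean-up I would like to do before applying the Regex. It assumes the hidden values always sit on the line directly below that header row, which I have not confirmed. The sample text and the names `drop_lines_after_header` and `hidden_line_count` are invented for the example.

```python
# Header row that appears just above the hidden values in the extracted text.
HEADER = "NIF Apellidos o Razón social Nombre"


def drop_lines_after_header(text, header=HEADER, hidden_line_count=1):
    """Remove the given number of lines that follow each header row.

    Assumption: the hidden values occupy exactly `hidden_line_count`
    lines directly below the header.
    """
    cleaned = []
    skip = 0
    for line in text.splitlines():
        if skip > 0:
            skip -= 1  # this line is assumed to be hidden data, drop it
            continue
        cleaned.append(line)
        if line.strip() == header:
            skip = hidden_line_count
    return "\n".join(cleaned)


# Made-up sample: the third line plays the role of the hidden data.
sample = (
    "Datos\n"
    "NIF Apellidos o Razón social Nombre\n"
    "00000000T PEREZ JUAN\n"
    "Campo: valor"
)
cleaned = drop_lines_after_header(sample)
```

If the hidden data ends up mixed into the same line as what I type, this would not be enough.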

I hope I have explained it well, since I can’t upload any document!

William_Blech_Sister · June 3, 2020, 12:15pm

Hi, I found the hidden text you mentioned. It is very odd indeed.

As an alternative, you could open the file using the ‘Start Process’ activity and then use the ‘Get Text’ activity to extract text from the file. I managed to test it successfully.
