open an Editable PDF filled with an individual’s information
read the information filled by the individual in the Editable PDF
output the information read into different variables for later processing
It should be easy, but I’m having trouble getting UiPath to read the PDF. I’ve tried screen scraping, Get Text, and the Anchor Base functionality with no luck.
Thanks for providing the PDF. I have tried this using both Read PDF Text and Read PDF with OCR, and both work pretty well compared with the usual success rate of OCR. Please see the attached file and let me know if you have any questions. Please note the file was not open when I ran this.
Hi @richarddenton - Is there a way to extract only single fields, not entire OCR, without opening the file?
The PDFs I’m using, like the user’s examples, have editable text fields and aren’t images. I would like to use get text, but it doesn’t work without opening the files.
I know this is a super late answer, but I will answer it anyway in case any future users have the same problem:
As instructed in UiPath Academy, the PDF activities (Read PDF Text, Read PDF with OCR) can read the file without opening it. However, the output will be the whole file.
If you only want to scrape a single field rather than the whole document, there is no other way but to open the PDF file.
Hi @richarddenton, I am having the same issue as @gonmartins. I downloaded and ran your solution, but in my case I still have the same problems:
The Read PDF Text activity scrapes all text in the PDF, except the information in the editable fields.
The Read PDF with OCR activity does the same, but the information in the editable fields comes out garbled.
I’ve tried opening the file and using Anchor Base to extract field by field. It works for certain fields but not for others, and when I run it again the fields that previously worked no longer work, and vice versa.
Any ideas? I’m pretty frustrated. I’m using the latest Studio release, 2018.2.2.
Thanks
Did you ever resolve this? Running into this issue currently.
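One approach that may help, offered as a sketch rather than a confirmed fix: values typed into editable fields are normally stored as form-field data (AcroForm) rather than as page text, which would be consistent with Read PDF Text skipping them. Assuming Python and the third-party pypdf library are available (neither is part of UiPath, so this would run as a script outside the PDF activities), the fields can be read by name without opening the file in a viewer. The file name, the field names, and the `pick` helper below are hypothetical and only for illustration.

```python
from typing import Any, Dict, Mapping


def read_form_fields(pdf_path: str) -> Dict[str, Any]:
    """Return {field name: value} for every form field in the PDF."""
    # pypdf is a third-party package (pip install pypdf). It is imported
    # here so that the helper below stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # get_fields() returns None when the PDF has no interactive form.
    fields = reader.get_fields() or {}
    # Each field is dictionary-like; "/V" holds the value the user entered.
    return {name: field.get("/V") for name, field in fields.items()}


def pick(values: Mapping[str, Any], name: str, default: str = "") -> str:
    """Return one field as a stripped string, or `default` if missing or empty."""
    value = values.get(name)
    if value is None:
        return default
    return str(value).strip()


if __name__ == "__main__":
    # Hypothetical file and field names; replace them with the real ones.
    values = read_form_fields("application_form.pdf")
    first_name = pick(values, "FirstName")
    last_name = pick(values, "LastName")
    email = pick(values, "Email", default="n/a")
    print(first_name, last_name, email)
```

Printing `values.keys()` once is a quick way to find out what the fields are actually called, since the names inside the PDF often differ from the labels shown on the page. If the dictionary comes back empty, the PDF probably has no real form fields and this approach does not apply.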