I have a list of PDF URLs in a datatable. I would like to open each PDF and scrape some data.
I am struggling inside the For Each loop. If I open the URL directly, the Read PDF command does not seem to work on a webpage. I would rather not download each PDF, as I only need a small amount of data from each. I tried opening Adobe Pro, intending to send a hotkey (Ctrl+O) to bring up the Open dialog and paste the file name in. When I open Adobe Pro with Open Application, it opens but throws a timeout error (it seems to think the application didn’t open?). When I open it with Start Process, it opens but doesn’t receive the hotkey properly (the file browser never appears). Any suggestions on how to approach this?
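You could try a very general selector (one that captures the whole text as innertext or similar), using wildcards like *. Then use the Get Attribute activity to read the text attribute and cut out what you need with regex, or better, split the string on line breaks with yourString.Split(Environment.NewLine.ToCharArray) or on periods with yourString.Split("."c).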
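To illustrate the "split, then cut out what you need" step, here is a minimal sketch in Python rather than a UiPath expression. The sample text, the field labels, and the helper name find_field are all made up for the example; the real scraped text will look different, so treat the pattern as a starting point only.

```python
import re

# Hypothetical sample of text scraped from a PDF viewer page.
page_text = (
    "Invoice Number: INV-1042\n"
    "Customer: Example Ltd\n"
    "Total Due: 318.50 GBP"
)


def find_field(text, label):
    """Return the value that follows 'label:' on its own line, or None."""
    # Split the scraped text into lines, then test each line against the label.
    for line in text.splitlines():
        match = re.match(rf"\s*{re.escape(label)}\s*:\s*(.+)", line)
        if match:
            return match.group(1).strip()
    return None


total = find_field(page_text, "Total Due")
```

Searching line by line means it does not matter which page the value ends up on, as long as the label text itself is consistent.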
Have you tried the PDF activities from “Manage Packages > Official > UiPath.PDF.Activities”? I believe they need the PDF to be downloaded to a local file first, though.
Google Chrome is pretty good and simple for viewing PDFs.
Hi TastyToast, thank you. I did download the PDF activities. If the PDF is downloaded, “Read PDF” works a treat and I’m able to slice out the information I want. I’m just trying to avoid downloading loads of large files when I only need a fraction of the data (I still need to start with the full text, though, because the page the information appears on is not consistent). If I open the URL in a browser, “Read PDF” doesn’t work, and I can’t get the PDF to open properly in Adobe. Opening it in Adobe is the part I’m stuck on.
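If the read step only works on a local file, one possible compromise is to download each PDF to a temporary file, extract what is needed, and delete the file straight away, so only one PDF is on disk at any time. Below is a minimal Python sketch of that pattern, outside UiPath. The names process_remote_pdf and extract are hypothetical, and extract stands in for whatever actually reads the PDF.

```python
import os
import shutil
import tempfile
import urllib.request


def process_remote_pdf(url, extract):
    """Download url to a temporary file, run extract(path), then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        # Stream the response to disk so the whole PDF is never held in memory.
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, tmp)
        # The file is closed here, so the reader can open it freely.
        return extract(path)
    finally:
        # Runs whether or not the download or extraction failed.
        os.remove(path)
```

The same shape should carry over to a workflow: download, read, and delete inside each loop iteration, so the files never accumulate.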