Print webpage as pdf

J111 · April 7, 2020, 3:10pm

Hello Community,

I built a bot to open a web page in IE, scroll, and take screenshots until it reaches the bottom of the page.

Here’s my question: Is it possible to grab the entire web page and print it as PDF? If so, how?
When I tried this before, it captured the visible part of the page but could not scroll and capture the entirety of the page.

Thank you for your time and wishing everyone well.

msan · April 7, 2020, 4:18pm

Give wkhtmltopdf a try:

https://wkhtmltopdf.org

J111 · April 7, 2020, 4:42pm

I see how that could be helpful but I do not have a developer background and do not know what a precompiled binary is or how to download one.

msan · April 7, 2020, 4:53pm

There are installers available (wkhtmltopdf).

If you don’t know how to add it to your PATH (add to the path windows 10 - Google Search), call it by its full path, for example:

path-to-wkhtmltopdf.exe url-of-the-page path-to-my-output-file

%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe https://forum.uipath.com/t/print-webpage-as-pdf/209795 %USERPROFILE%\Desktop\post.pdf

J111 · April 8, 2020, 2:55pm

This looks helpful but still quite technical for my abilities.

Do I download the win32/win64 for windows vista or later if I’m using Windows 10? How would I put that in my workflow?

Is there another way to do this?

msan · April 8, 2020, 3:51pm

Yes, this works on Windows 10. In your workflow, use StartProcess with:

FileName: the path to wkhtmltopdf.exe, for example:
"%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe"
Arguments: a string with the page’s URL and the path for the output PDF. For example, "https://forum.uipath.com %USERPROFILE%\Desktop\forum.pdf" will save the page to your desktop as forum.pdf.
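
For readers who want to try the same call outside a workflow, here is a minimal Python sketch of the FileName/Arguments pair described above. It is only an illustration, not part of the suggested workflow: the function names `build_wkhtmltopdf_command` and `save_page_as_pdf` are hypothetical, and it assumes wkhtmltopdf is already installed at the path you pass in.

```python
import os
import subprocess


def build_wkhtmltopdf_command(exe_path, url, output_pdf):
    """Return the argument list for one run: executable, page URL, output file."""
    # os.path.expandvars resolves %NAME% placeholders on Windows
    # (and $NAME on any platform); unknown variables are left unchanged.
    return [os.path.expandvars(exe_path), url, os.path.expandvars(output_pdf)]


def save_page_as_pdf(exe_path, url, output_pdf):
    """Run wkhtmltopdf; raises CalledProcessError on a non-zero exit code."""
    command = build_wkhtmltopdf_command(exe_path, url, output_pdf)
    # Passing a list instead of one long string keeps paths with spaces intact.
    subprocess.run(command, check=True)
    return command[-1]
```

On Windows, calling `save_page_as_pdf(r"%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe", "https://forum.uipath.com", r"%USERPROFILE%\Desktop\forum.pdf")` should write forum.pdf to the desktop, assuming the executable really is in that folder.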

J111 · April 8, 2020, 9:23pm

I downloaded it for Windows 10. What do I do next? A few folders were created with lots of data inside and I’m unsure of the next step.

msan · April 9, 2020, 8:33am

Try opening cmd, paste the following line, and press [ENTER]:

%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe https://forum.uipath.com/t/print-webpage-as-pdf/209795 %USERPROFILE%\Desktop\post.pdf

www.google.fr/search?q=windows+10+run+console

J111 · April 17, 2020, 2:50pm

Hello @msan, I used the links you shared and downloaded the files. The wkhtmltopdf.exe file was not downloaded, and this is the error I get when I run the following in cmd:

%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe https://forum.uipath.com/t/print-webpage-as-pdf/209795 %USERPROFILE%\Desktop\post.pdf

ERROR: ‘C:\Program’ is not recognized as an internal or external command, operable program or batch file.

msan · April 17, 2020, 3:51pm

Hi,

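Please try it with double quotes. %PROGRAMFILES% expands to a path that contains a space (C:\Program Files), so without quotes cmd treats C:\Program as the command: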

"%PROGRAMFILES%\wkhtmltopdf\bin\wkhtmltopdf.exe" https://forum.uipath.com/t/print-webpage-as-pdf/209795 "%USERPROFILE%\Desktop\post.pdf"

If the installer added it to your PATH (I don’t use the installer, so I don’t know if it does), you could just try:

wkhtmltopdf https://forum.uipath.com/t/print-webpage-as-pdf/209795 "%USERPROFILE%\Desktop\post.pdf"
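
When neither form is recognized, it can help to check where the executable actually is. The following Python sketch tries PATH first and then a fallback folder. The helper name `find_tool` is hypothetical, and the default folder is simply the one used in this thread, which may not match where your installer put the files.

```python
import os
import shutil


def find_tool(name="wkhtmltopdf", fallback_dir=r"%PROGRAMFILES%\wkhtmltopdf\bin"):
    """Return a path to the tool, or None if it is not on PATH or in fallback_dir."""
    # shutil.which searches the PATH directories, like typing the bare name in cmd.
    on_path = shutil.which(name)
    if on_path:
        return on_path
    # Assumed install folder (taken from this thread); adjust it to your machine.
    candidate = os.path.join(os.path.expandvars(fallback_dir), name + ".exe")
    return candidate if os.path.isfile(candidate) else None
```

If `find_tool()` returns None, the executable is in neither place, which would suggest the download or install step did not produce wkhtmltopdf.exe where the commands above expect it.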

J111 · April 20, 2020, 2:48pm

Hello,

Thank you for your help but it’s still not working.

I typed both of those into cmd and neither was recognized. I downloaded wkhtmltopdf, and I have the wkhtmltopdf file inside the bin folder and pdf from the wkhtmltox folder.

Not sure how to proceed.

sbarbaro · April 22, 2020, 4:29pm

Hello everyone,

Here is my approach to saving a webpage as a PDF using Windows 10 Pro and Chrome. Use a Start Process activity to start headless Chrome with arguments to save/print to PDF. It works pretty well and renders web pages as you would expect. I have read of formatting issues with some approaches. If you have Chrome installed, no additional software installation is required. I found this approach more reliable than printing to PDF through the Chrome user interface.

If your robot/app needs to enter values into a web page form prior to saving, I would consider entering those values and then saving the resulting HTML file (this uses the Windows save dialog, which IMO is much friendlier to automation than the Chrome print dialog). Then feed the HTML file to headless Chrome.

Perhaps this is all easier with Internet Explorer or Edge, but I needed Chrome.

SavePDFofWebpageUsingHeadlessChrome.xaml (5.7 KB)
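
The contents of the attached workflow are not shown here, so the following is only a guess at the kind of call it makes: a minimal Python sketch that starts headless Chrome with the `--headless` and `--print-to-pdf` switches. The function names and the timeout value are hypothetical, and the Chrome path is whatever applies on your machine.

```python
import subprocess


def build_headless_chrome_command(chrome_path, source, output_pdf):
    """Return the argument list that asks headless Chrome to print `source` to PDF."""
    return [
        chrome_path,
        "--headless",                    # run without opening a browser window
        "--disable-gpu",                 # often added on Windows; may be unnecessary on newer versions
        f"--print-to-pdf={output_pdf}",  # write the rendered page to this file
        source,                          # a URL or the path of a saved .html file
    ]


def print_to_pdf(chrome_path, source, output_pdf, timeout_seconds=60):
    """Run headless Chrome once; raises on a non-zero exit code or on timeout."""
    command = build_headless_chrome_command(chrome_path, source, output_pdf)
    # On timeout, subprocess.run kills the process it started and raises TimeoutExpired.
    subprocess.run(command, check=True, timeout=timeout_seconds)
    return output_pdf
```

A typical call might be `print_to_pdf(r"C:\Program Files\Google\Chrome\Application\chrome.exe", "https://forum.uipath.com", "forum.pdf")`, assuming Chrome is installed at that location. The timeout is there so a hung instance does not linger indefinitely.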

Caveats:

Running headless Chrome instances on my machine (Windows 10 Pro) seemed to generate a lot of background Chrome processes that failed to close/exit. As I continued to use my robot to generate PDFs, the system would freeze once it had too many Chrome instances (I think, anyway). So I upgraded Chrome, after which nothing worked. This sent me down several rabbit holes. At the end of the day, I needed to update my ChromeDriver to match the updated Chrome. After both Chrome and ChromeDriver were updated, I no longer had issues with Chrome instances failing to exit. Prior to the updates, my system would freeze at about 20 instances. I’ve tested the above with 60 separate instances with no issues.
The approach above is, I believe, a low-tech/newbie way to achieve some of the functions of Puppeteer.
See GitHub - puppeteer/puppeteer: Headless Chrome Node.js API
