I automated the download of a PDF document from a webpage. From time to time the URL of the webpage changes, so the bot can no longer find the PDF.
Does anyone have an easy way to check regularly whether the URL has changed, or any other idea for reducing the maintenance effort of updating the URL manually?
If I understand the scenario correctly, I’d suggest building a handler that runs when an exception occurs inside the Open/Attach Browser container.
This handler (a sequence) should include a ‘Get Attribute’ activity with the attribute property set to “url”. If the URL has changed, you can navigate back to the expected URL and try to fetch the PDF again.
I understand the suggestions, but from my point of view they are not applicable in my case.
Link for first access was:
Link for second access is:
I am accessing the PDF directly through a URL built from the pattern above, where only 202011 changes according to the month (202012 for December). We can handle this monthly change; the exception occurs because “/2020-11/” was removed from the link pattern.
They removed a part of the link, which breaks our pattern. With wildcards we would still not find the PDF and would get “page not found”. I tried “Get Attribute” as well, but was not able to get the URL.
It is tricky because the URL pattern changes from time to time. I guess many people face this issue and solve it by adjusting the URL manually every time. I am wondering whether there is an easier and, more importantly, automated solution that I do not know about yet.
I don’t think there is a way to guess the next URL (when it changes).
However, the current URL is always included in the web page that you start from. You could load that page in a browser (or download the HTML page and parse it) and then locate the correct URL there.
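To illustrate the "download the HTML page and parse it" idea, here is a minimal Python sketch using only the standard library. It is not a UiPath workflow, just an outline of the logic. The function names (`fetch_html`, `find_pdf_links`), the `month_token` parameter, and the example.com addresses are all made up for illustration, and it assumes the start page links to the PDF with a plain `<a href="...">` tag whose path ends in `.pdf`. If the link is generated by JavaScript, this approach would not see it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(page_url, timeout=30):
    """Download the start page (hypothetical helper, needs network access)."""
    with urlopen(page_url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def find_pdf_links(html, base_url, month_token=None):
    """Return absolute URLs of all PDF links found in the page.

    month_token is an optional substring such as "202012"; when given,
    only links containing it are returned.
    """
    collector = _LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(base_url, href)
        if not urlparse(absolute).path.lower().endswith(".pdf"):
            continue
        if month_token and month_token not in absolute:
            continue
        links.append(absolute)
    return links
```

The bot would call `fetch_html` on the start page, pass the result to `find_pdf_links` with the current month (for example `month_token="202012"`), and download whatever URL comes back. Because the link is read from the page each time, a change in the folder structure of the PDF URL no longer matters, as long as the start page itself stays at the same address.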