Downloading x number of Files from Websites


#1

Hi,

I am trying to create a bot that navigates to various utility company websites and downloads all relevant tariff PDF files. Two example pages are linked below:
https://aepohio.com/account/bills/rates/aepohioratestariffsoh.aspx
https://www.psoklahoma.com/account/bills/rates/

As you can see, each website has a different number of links, and the links are formatted slightly differently. I’ve been able to download a single file from one website, but I can’t figure out how to have the bot identify all the links on a given webpage and then loop through them, downloading and saving each file.

Any help or suggestions on tools to use would be greatly appreciated!


#2

Interesting & challenging, let me try.


#3

How many different providers/company websites are there?

It might be a good idea to gather a list of all the download links you have and see whether they share a common pattern you could use to identify them (maybe just the fact that they all end in “.pdf”). You could set the bot to go to each page, find all elements on it that contain links ending in “.pdf”, and add those links to your orchestrator queue. Since each queue item just opens a PDF, processing it would simply mean saving the file.
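
To illustrate the idea outside of any particular bot tooling, here is a rough Python sketch using only the standard library. It assumes the PDF links are present in the page's static HTML as ordinary <a href="..."> tags (if a site builds its links with JavaScript, this will not find them). The names PdfLinkParser, find_pdf_links and download_all are made up for this example.

```python
import os
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collects the href of every <a> tag whose URL path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Check the path only, so a query string does not hide the extension.
        is_pdf = urlparse(absolute).path.lower().endswith(".pdf")
        if is_pdf and absolute not in self.links:
            self.links.append(absolute)


def find_pdf_links(html, base_url):
    """Return the unique absolute PDF URLs found in the given HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_all(urls, dest_dir):
    """Save each URL into dest_dir, named after the last part of its path."""
    os.makedirs(dest_dir, exist_ok=True)
    for url in urls:
        filename = os.path.basename(urlparse(url).path)
        urllib.request.urlretrieve(url, os.path.join(dest_dir, filename))
```

You would fetch each page's HTML, pass it to find_pdf_links together with the page URL, and hand the resulting list to download_all (or push it onto your queue instead). Note that two files with the same name on different sites would overwrite each other, so a separate folder per provider is probably safer.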

Otherwise, if you know that the only things that change in the download links are the month and year, you could copy the links and replace those parts with variables. You might then be able to skip navigating the websites altogether and go directly to the download links.
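
As a rough sketch of the variable approach in Python: the URL pattern below is purely hypothetical (I don't know what the real links look like), so you would replace it with whatever pattern you actually see on each provider's site. The function name monthly_urls is also just made up for the example.

```python
from datetime import date

# Hypothetical pattern, not taken from any real provider's site.
URL_TEMPLATE = "https://example.com/tariffs/{year}/{month:02d}/tariff.pdf"


def monthly_urls(template, start, end):
    """Return one URL per month from start to end, inclusive."""
    urls = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        urls.append(template.format(year=year, month=month))
        # Step to the next month, rolling over into the next year.
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return urls
```

For example, monthly_urls(URL_TEMPLATE, date(2023, 11, 1), date(2024, 2, 1)) gives four URLs, one each for November 2023 through February 2024, which you could then download directly.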