Data Scraping Hangs/Freezes on Last Page

I'm using the data scraping wizard to scrape a table that spans multiple pages. On the last page it sits there until the timeout is reached (30 seconds by default), then continues to the next activity. Is that how this is supposed to work?

The data all gets scraped properly, but the delay on the last page is annoying. I can reduce the delay by setting the Timeout property, but then I risk the page not fully loading before it times out. How is the data scraping wizard supposed to recognize that it is at the last page? I would think that once it finds that there is no “next” link it would move to the next activity. Instead, it seems that rather than treating a “complete” status from “WaitForReady” as the signal to move on when no “next” link is found, it waits out the whole timeout period.

Here are the details on data I’m scraping:

http://webapps.rrc.state.tx.us/PR/publicQueriesMainAction.do

Test query inputs
Lease Type: “Gas Well”
District: “06”
Prod Month Range: “Jan 2017 - Feb 2020”

Next button Selector:

Unfortunately, this is in fact how it is supposed to work. The activity keeps trying to click the Next button until that attempt throws an error. You can see this by setting the “ContinueOnError” flag to False: the activity then fails whenever a Next Item selector is defined, which indicates that the activity completes by hitting an error on the Next Item selector.
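To make the cost of that behavior concrete, here is a rough Python sketch of a generic "poll until found or timed out" lookup. It is not UiPath's actual implementation; the function name `wait_for_element`, its parameters, and the polling interval are all assumptions made for illustration. The point is only that when the target never appears, as with a Next link on the last page, the caller pays for the full timeout before the error surfaces.

```python
import time


def wait_for_element(find, timeout_s=30.0, poll_s=0.5,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll find() until it returns something truthy or timeout_s elapses.

    Hypothetical helper: `find` is any callable that returns the element,
    or None when it is not on the page. `clock` and `sleep` are injectable
    so the behavior can be checked without really waiting.
    """
    deadline = clock() + timeout_s
    while True:
        element = find()
        if element:
            return element
        if clock() >= deadline:
            # On the last page there is no Next link, so this is the only exit.
            raise TimeoutError("element not found within timeout")
        sleep(poll_s)
```

Under these assumptions, lowering `timeout_s` shortens the wait on the last page, but it shortens the wait on every slow-loading page as well.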

There are ways around this if you don’t like this behavior. You can use a while loop that checks whether the Next button is active. If it is not, just don’t scrape any more data. This will speed up your automation, but will take more time to build.
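The loop described above could look roughly like the following, sketched in Python rather than as a UiPath workflow. Everything here is a stand-in: `scrape_all_pages`, the three callables it takes, and the `FakePager` class are hypothetical names, and in a real workflow they would be replaced by whatever activities read the table, check the Next button, and click it.

```python
def scrape_all_pages(read_rows, next_is_enabled, click_next, max_pages=1000):
    """Collect rows page by page, stopping as soon as Next is not usable."""
    rows = []
    for _ in range(max_pages):  # hard cap so a broken check cannot loop forever
        rows.extend(read_rows())
        if not next_is_enabled():
            break  # last page: leave immediately instead of waiting for a timeout
        click_next()
    return rows


class FakePager:
    """In-memory stand-in for a paginated results table (illustration only)."""

    def __init__(self, pages):
        self.pages = pages
        self.index = 0

    def read_rows(self):
        return list(self.pages[self.index])

    def next_is_enabled(self):
        return self.index < len(self.pages) - 1

    def click_next(self):
        self.index += 1


pager = FakePager([["row 1", "row 2"], ["row 3"], ["row 4", "row 5"]])
all_rows = scrape_all_pages(pager.read_rows, pager.next_is_enabled, pager.click_next)
print(all_rows)
```

The check for the Next button runs after each page is read, so the last page is still scraped, and the loop exits without ever attempting a click that is bound to fail.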

A simpler solution is to lower the Timeout parameter to something less than 30000 milliseconds.


Thanks for the quick reply!
