Hi there,
I’m scraping data from a website with about 18,000 records that uses endless scrolling. I’m using the following steps to extract the data, and the logic itself works fine.
- Assign ExtractedRecords_Previous = 0
- Send Hot Key [End] (this scrolls down to the end of the page)
- Extract the data (Data Scraping) and count ExtractedRecords_Current
- If ExtractedRecords_Previous = ExtractedRecords_Current, stop scraping
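For anyone who wants to reproduce the logic outside UiPath, the steps above can be sketched in JavaScript roughly as follows. This is only a minimal sketch: `scrapeUntilStable`, `scrollToEnd`, `countRecords`, and `waitMs` are hypothetical names that are not part of my workflow, and the sketch assumes the previous count is updated after each pass.

```javascript
// Hypothetical helper mirroring the steps above: scroll, count, and stop
// once the record count no longer grows.
// `scrollToEnd` and `countRecords` are placeholders supplied by the caller.
async function scrapeUntilStable(scrollToEnd, countRecords, waitMs = 1000) {
  let previousCount = 0; // ExtractedRecords_Previous
  while (true) {
    // Scroll to the end of the page (the "Send Hot Key [End]" step)
    await scrollToEnd();
    // Give the page some time to load more records
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    // Extract and count (ExtractedRecords_Current)
    const currentCount = await countRecords();
    // No new records since the last pass: stop scraping
    if (currentCount === previousCount) {
      return currentCount;
    }
    previousCount = currentCount;
  }
}
```

In a browser console it could be called with something like `scrapeUntilStable(() => window.scrollTo(0, document.body.scrollHeight), () => document.querySelectorAll('.record').length)`, where `.record` is a made-up selector that would have to match the real page.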
But after scrolling for a long time and extracting approximately 10K to 11K records, the Chrome tab crashes with Error Code: Out Of Memory. From my research, it looks like the UiPath Browser Extension is what consumes the memory.
If I close or refresh the tab, it opens again, but I have to start the data scraping all over. Is there a way to detect this before it happens and handle it, or is there a permanent solution?
Please I need help.
CC: @loginerror @Palaniyappan @marian.platonov @moosh
|| UPDATES ||
I ran a script in the browser console to rule out the UiPath Extension, and the same error came back. Therefore, it is not caused by the UiPath Extension.
// Declare some constants
const MAXIMUM_NUMBER_OF_TRIALS = 5;
const MINIMUM_SLEEPING_TIME_IN_MS = 500;
const MAXIMUM_SLEEPING_TIME_IN_MS = 2000;
// Utility functions
const sleep = (time) => new Promise((resolve) => setTimeout(resolve, time));
const randomNumber = (minimum, maximum) => Math.floor(Math.random() * (maximum - minimum)) + minimum;
const randomSleep = () => sleep(randomNumber(MINIMUM_SLEEPING_TIME_IN_MS, MAXIMUM_SLEEPING_TIME_IN_MS));
// How to get to the bottom of an infinite scroll
let currentScrollHeight = 0;
let manualStop = false;
let numberOfScrolls = 0;
let numberOfTrials = 0;
while (numberOfTrials < MAXIMUM_NUMBER_OF_TRIALS && !manualStop) {
  // Keep the current scroll height
  currentScrollHeight = document.body.scrollHeight;
  // Scroll at the bottom of the page
  window.scrollTo(0, currentScrollHeight);
  // Wait some seconds to load more results
  await randomSleep();
  // If the height hasn't changed, there may be no more results to load
  if (currentScrollHeight === document.body.scrollHeight) {
    // Try another time
    numberOfTrials++;
    console.log(
      `Is it already the end of the infinite scroll? ${MAXIMUM_NUMBER_OF_TRIALS - numberOfTrials} trials left.`,
    );
  } else {
    // Restart the number of consecutive trials
    numberOfTrials = 0;
    // Increment the number of successful scroll
    numberOfScrolls++;
    console.log(`The scroll #${numberOfScrolls} was successful!`);
  }
}
console.log('We should be at the bottom of the infinite scroll! Congratulations!');
console.log(`${numberOfScrolls} scrolls were needed to load all results!`);
I wish there were a way to detect this and handle it before it happens.
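One idea I have not verified on this site: Chrome exposes a non-standard `performance.memory` object with `usedJSHeapSize` and `jsHeapSizeLimit`, so the scroll loop could check how close the JavaScript heap is to its limit and stop early. The helper name `isHeapNearLimit` and the 0.8 threshold below are my own assumptions, and the reported values are coarse and may not cover all of the memory the tab uses (for example, the DOM nodes themselves), so this is a rough sketch rather than a reliable detector.

```javascript
// Hypothetical helper: returns true when the used JS heap is at or above
// `threshold` (a fraction between 0 and 1) of the heap size limit.
// `memoryInfo` is expected to look like Chrome's non-standard
// `performance.memory` object; other browsers do not provide it.
function isHeapNearLimit(memoryInfo, threshold = 0.8) {
  // Without usable memory info we cannot tell, so report "not near the limit"
  if (!memoryInfo || !memoryInfo.jsHeapSizeLimit) {
    return false;
  }
  return memoryInfo.usedJSHeapSize / memoryInfo.jsHeapSizeLimit >= threshold;
}
```

Inside the `while` loop of the console script, a check such as `if (isHeapNearLimit(performance.memory)) { manualStop = true; }` would end the loop before the limit is reached, so the records loaded so far could be extracted and saved instead of being lost to the crash.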