Web scraping while taking a screenshot of each page being scraped

I am doing screen scraping of an online shop's listings. So far I can only get the links of the shop listings in each scraping session. My question is: how do I perform some activities in between the scraping of each new page (e.g. taking a screenshot of each page after it has been scraped, before moving on to the next page)?

You can achieve this as follows:

  1. Get the total page count.
  2. Loop through each page.
  3. Run Data Scraping without adding the next-page selector, so that it extracts the data from the current page only.
  4. Take a screenshot.
  5. Continue until all the pages are done.
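
The ordering of these steps can be sketched in plain Python, outside UiPath. This is only an illustration of the loop (scrape the current page, take the screenshot, then move on). The names go_to_page, scrape_page and take_screenshot are hypothetical callables standing in for the Click, Data Scraping and Take Screenshot activities; they are not UiPath APIs.

```python
from typing import Callable, Dict, List


def scrape_with_screenshots(
    page_count: int,
    go_to_page: Callable[[int], None],
    scrape_page: Callable[[int], List[str]],
    take_screenshot: Callable[[int], str],
) -> Dict[int, dict]:
    """Visit pages 1..page_count; on each page scrape first, then screenshot."""
    results: Dict[int, dict] = {}
    for page in range(1, page_count + 1):
        go_to_page(page)              # hypothetical: navigate to this page
        links = scrape_page(page)     # scrape the current page only
        shot = take_screenshot(page)  # screenshot before moving on
        results[page] = {"links": links, "screenshot": shot}
    return results


# Stand-in callables so the sketch runs without a browser.
visited: List[int] = []
demo = scrape_with_screenshots(
    page_count=3,
    go_to_page=visited.append,
    scrape_page=lambda p: [f"listing-{p}-a", f"listing-{p}-b"],
    take_screenshot=lambda p: f"page_{p}.png",
)
print(demo[2]["screenshot"])
```

Passing the three actions in as callables keeps the loop itself independent of how the browser is driven.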

Cheers

@Penganimation This is an example of what you are trying to achieve: Process.xaml (12.8 KB)

I feel we should explain the steps instead of only providing the XAML.

That way the developer can understand the steps and build the workflow themselves, which will help them understand the concept better.

You are right. I will prepare a document and send it across to you.

Wow, this community is amazing! Thank you so much, everyone!

So it means looping through each page manually, without using the automatic next-page function that comes with web scraping, right?

If I understand correctly, you are suggesting to do the data scraping first, then go back to page one and take the screenshots page by page all over again?

That's correct.

Thank you so much!

  • Attach to the browser using the Attach Browser activity (all of the following steps should be placed inside the Attach Browser body).
  • Assign a variable: Page = 1
  • Drag in a Do While loop activity (all of the following steps should be placed inside the Do While body).
  • Drag in another Assign activity. Put the Page variable on the left and the expression Page + 1 on the right, so that it reads Page = Page + 1
  • Create another variable: DynamicSelector = "<webctrl aaname='" + Page + "' tag='A'/>" (if Page is an Int32 variable, you will probably need Page.ToString for the concatenation to be valid). Here Page makes the selector dynamic: when the page number changes, the selector targets the matching page link.
  • Drag in a Click activity and indicate the link to the second page. Cut the selector out of it, click OK, and use that selector in the 5th step. For example, if your selector is (<html app='chrome.exe' title='your page name' /> ), cut everything and paste only ( ) in the 5th step.
  • In the Do While condition, put Page < your last page number, for example Page < 10
  • Now you can do whatever you want to do here.
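
To make the dynamic selector step concrete, here is a minimal Python sketch of the same string concatenation. It only shows what the selector text looks like for different page numbers. The function name build_page_selector is made up for illustration, and the attribute values (aaname, tag='A') are taken from the example above; your own site's selector may differ.

```python
def build_page_selector(page: int) -> str:
    """Return the partial selector for the pagination link labelled `page`.

    Mirrors the expression "<webctrl aaname='" + Page + "' tag='A'/>",
    converting the number to text before concatenating.
    """
    return "<webctrl aaname='" + str(page) + "' tag='A'/>"


# Print the selector produced for a few page numbers.
for page in (2, 3, 10):
    print(build_page_selector(page))
```

Only the page number changes between iterations; the rest of the selector stays fixed.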

Here is the XAML file I tested: Main.xaml (11.5 KB)

Oh my god, thank you so much, MartianxSpace. The dynamic selector concept is amazing!

Did it work in your case? If it did, then I am glad that I was able to help you out.

I haven't tried it yet, but it looks like it will work for my case. Thank you so much.

Sorry, one question: may I know what this represents?
<webctrl aaname='" + Page + "' tag='A'/>

If Page = 3, then it becomes <webctrl aaname='3' tag='A'/>, right?

Then what does this selector actually represent? Is it a pointer to the whole page or to a particular part of the page?

I am a bit confused.

Your selector could be different from mine, but there will be a page number in it. Suppose you used a Click activity when you were on page one, and now you want to go to the second page without changing the process. You can't, because the selector still refers to the first page. That is why we use a dynamic selector, which changes automatically based on your need. The Page variable passes its value into the selector, which could be page 1, page 5, and so on.

If you need any help, please let me know.

Page + 1 is a counter that increases the number by 1 every time. Suppose you are currently extracting data on page one. Once that is done, the counter moves to the next page, because it adds 1 to the current page. If you are on page 1, the next page is 1 + 1, which is page 2, and it does the same for the other pages until you want it to stop.
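
The counter can be traced with a small Python sketch. It assumes the setup described earlier in the thread: Page starts at 1, the loop body first does Page = Page + 1 and then clicks that page, and the Do While condition is Page < last page. Python has no do-while, so the sketch emulates it with a loop that checks the condition at the end; the list append is a stand-in for the Click activity.

```python
from typing import List


def pages_clicked(last_page: int) -> List[int]:
    """Return the page numbers clicked by the do-while counter."""
    page = 1
    clicked: List[int] = []
    while True:                      # body runs at least once, like Do While
        page = page + 1              # the counter: Page = Page + 1
        clicked.append(page)         # stand-in for clicking the link for `page`
        if not (page < last_page):   # Do While condition: Page < last page
            break
    return clicked


print(pages_clicked(10))
```

With a last page of 10, the loop clicks pages 2 through 10, since page 1 is already open when the loop starts.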
