Navigate To All Links & Scrape All Of Them


#1

Hi Everyone,

I’m quite new to UiPath.

I have a web page, and every day I have to record all the new entries from it into Excel.
For example, today there were 787 new entries, and I scraped all of them to a CSV file.

But then I have to click each of their links and scrape those pages to a CSV file as well.
Every page is the same:

  1. The links look like:
    https://www.xxx.com/link/?id=506228,
    https://www.xxx.com/link/?id=506229
    https://www.xxx.com/link/?id=506230

    Only the id part is different.

  2. Every element is the same on those links. So if I can “Navigate To” all the links, I think I can scrape all the data inside them.

I hope I explained it well :frowning:

Any ideas?

Thanks in advance


#2

Use a For Each Row loop to iterate through the DataTable, and inside the loop do a simple Open Browser + scrape.
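In plain code terms, that loop looks roughly like the sketch below (Python rather than UiPath, just to show the logic; the column name `url` and the sample rows are assumptions, and the actual page visit/scrape is left as a comment):

```python
import csv
import io

# Stand-in for the CSV produced by the first scrape; in the real workflow
# this would be read from the file itself (filename assumed).
csv_text = (
    "id,url\n"
    "506228,https://www.xxx.com/link/?id=506228\n"
    "506229,https://www.xxx.com/link/?id=506229\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))

visited = []
for row in rows:        # the "For Each Row" part
    url = row["url"]    # the "Get Row Item" part
    # In UiPath, this is where Open Browser / Navigate To plus the second
    # scrape would go; here we only record which URL would be opened.
    visited.append(url)

print(visited)
```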


#3

Could you please explain more :slight_smile:

As I said, I’m new to UiPath.

Thank you for this!


#4

I’ll see if anyone can pick this up; if not, ping me again soon.


#5

Hi again,

Could you please help me :slight_smile:


#6

Ok, so the scraping-to-CSV part is done and working, right?

Have a look at the DataTables tutorial to see how to loop through the entire table.


(the relevant part is around 10:00 in)

And in the For Each loop, instead of the WriteLine, use an Open Browser activity with the URL from the CSV file.


#7

Hi Cosin,

I am facing the same problem. I used a For Each loop, but I am only able to go to the first link, not all the links. Could you please help me?


#8

Hi,
Could you please upload your workflow and Excel file?


#9


#10

Hi,
It would’ve been better if you could have uploaded the workflow.
Anyway, looking at the screenshot, you’re using a single DataTable, but I’m not sure how the Get Row Item activity will identify the column name/index of the URL, since it isn’t being read from Excel.
So could you please do the following:

Drag a Read Range activity after the Write Range and create one more DataTable, then use a For Each Row activity, pass the column name (url) to a Get Row Item activity, and then use a Navigate To activity.


#11

Samsung.zip (188.6 KB)


#12

Hi,

Please find the attached workflow.


#13

Hi,

I tried using the Read Range activity, but it’s not working. Could you please help?


#14

Hi,
Check now. It’s working :slight_smile:
It saves the report with the mobile name and URL, then navigates to the respective URL in the same tab.

Here we go.
Main.xaml (24.7 KB)


#15

Hi Dilip,

Thank you, it’s working.

But when scraping the URLs for sponsored links, it scrapes the URL in this format: (/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8&adId=A00172302RXUJ87S0Z95A&url=https%3A%2F%2Fwww.amazon.com%2FDashboard-Ultimate-Flexible-APPS2Car-Windshiled%2Fdp%2FB071FLGFML%2Fref%3Dsr_1_1%3Fie%3DUTF8%26qid%3D1504154475%26sr%3D8-1-spons%26keywords%3Dsamsung%2Bmobiles%26psc%3D1&qualifier=1504154475&id=6765837114264527&widgetName=sp_atf), so it is not able to find the UI element and throws an error.

Without “sponsored”, it scrapes the link in this format, which works: (https://www.amazon.com/Samsung-Galaxy-J7-Prime-G610F/dp/B01MUSD2ST/ref=sr_1_3?ie=UTF8&qid=1504154475&sr=8-3&keywords=samsung+mobiles)


#16

Yep.
I think it’s a popup URL.
It looks like the sponsored URLs don’t start with the protocol over which the data is sent (https).
First, try prepending https://www.amazon.com/ to the “sponsor link” and make changes accordingly until the link opens. If it works, you can hardcode the same prefix for all the sponsored links.
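The prefixing idea can be sketched like this (a Python illustration, not UiPath; `urljoin` resolves a relative sponsored path against the site root while leaving normal absolute links untouched — the sample URLs are shortened versions of the ones quoted above):

```python
from urllib.parse import urljoin

base = "https://www.amazon.com/"

# A sponsored link as scraped (a relative path) and a normal absolute link.
sponsored = "/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf_aps_sr_pg1_1?ie=UTF8"
normal = "https://www.amazon.com/Samsung-Galaxy-J7-Prime-G610F/dp/B01MUSD2ST"

# urljoin resolves the relative sponsored path against the site root,
# and leaves the already-absolute URL unchanged.
print(urljoin(base, sponsored))
print(urljoin(base, normal))
```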


#17

Yes, exactly. I don’t want to scrape popup URLs. So is there an option to search each link for “sponsored”, and if “sponsored” is present, skip it; otherwise scrape it?


#18

There is no filter while data scraping.
So one way is to filter the DataTable afterwards:
filter out the rows whose Url column doesn’t start with https (search Google or the forum :slight_smile: ),
then copy the result to another cell, read from that cell, and use Navigate To for further processing.
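The filtering step can be sketched like this (a Python illustration of the same DataTable filter; the column names and sample rows are assumptions):

```python
import csv
import io

# Sample scraped table; the names and rows are made up for illustration.
csv_text = (
    "name,url\n"
    "Galaxy J7,https://www.amazon.com/Samsung-Galaxy-J7-Prime-G610F/dp/B01MUSD2ST\n"
    "Sponsored item,/gp/slredirect/picassoRedirect.html/ref=pa_sp_atf\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Keep only rows whose URL starts with "https" — sponsored redirect links
# were scraped as relative paths, so this drops them.
kept = [row for row in rows if row["url"].startswith("https")]

print([row["name"] for row in kept])
```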


#19

Do selectors help to filter while data scraping?


#20

You can filter the table for other purposes, but I’m not sure how you would identify a sponsored link unless you scrape it first, right?
:slight_smile: