Data scraping in multiple browser tabs

Hi guys! Hope you're doing well.

I have a CSV file of URLs that I extracted with data scraping. I am now opening each URL, and I want to scrape more data from each page and add it to that same CSV file. Maybe because I am quite new to RPA, I have not been able to do that.

Could you please assist with this issue?

Thanks in advance


Hey @simao,

Is the data you’re scraping from the URLs all in the same format, or are they different pages with different info/layout?

Second, do you need the scraped info from each URL put on the same line (row) the URL came from? I.e., are you looking at each line, getting a URL, going there, getting the info, pasting it to that line, then moving to the next line and repeating? Or is it something else, e.g. are you going to all the URLs, scraping data, going to the next one, scraping... and at the end you want to paste ALL the scraped data?

Let me know your method, and I will try to help you with an easy, understandable solution.

PS: Any other info about the CSV data (lines/rows) and the data you’re scraping may help too.

Cheers,


Hey @MikeBlades

Yes, yes and yes.

Please see the attached screenshot of my data table.
The data is all in the same format and layout - it was collected in a single data scraping process.
The aim is, for each row, to get the URL in column B, open the browser, collect some more data and paste it into column D - and then repeat for the next row.

Thanks for your help!

Tbh, this is where I am right now: I’ve collected the URLs and my robot is already able to open each URL, but I cannot extract further data. I am only able to do so for the first URL, so I need some sort of dynamic function.

Hey @simao,

I got ya!

So the way I would approach this is as follows:

Use “Build Data Table” for a NEW data table (let’s call it “dtbNewOutput”) and populate the columns with:
Description (string), URL (string), Price (string), New info (string).

Then add a “For Each Row” activity, and inside it add three “Get Row Item” activities to get the Description, the URL and the Price from the original Excel file.

Then add your already working steps (opening the URL in the browser, scraping the info and so on), and assign the “extra info” from each scrape to the ExtraInfo variable.

Next, add an “Add Data Row” activity and use an array to add all 4 variables to dtbNewOutput.

Then, AFTER the “For Each Row”, use a “Write Range” activity, which will create a new Excel file.
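For readers who think more easily in code, the flow above can be sketched in Python. This is only an illustration of the logic, not UiPath code: `scrape_extra_info` is a hypothetical placeholder for the browser-and-scrape step, the sample rows are made up, and a CSV file stands in for the Excel output.

```python
import csv
import io


def scrape_extra_info(url):
    """Hypothetical placeholder for the 'open browser and scrape' step."""
    return f"extra info for {url}"


def build_output(rows, scrape=scrape_extra_info):
    """Build a NEW table: the three original columns plus 'New info'."""
    # Header row, playing the role of "Build Data Table".
    output = [["Description", "URL", "Price", "New info"]]
    for row in rows:  # plays the role of "For Each Row"
        # The three lookups play the role of the "Get Row Item" activities.
        description = row["Description"]
        url = row["URL"]
        price = row["Price"]
        extra_info = scrape(url)
        # Appending the four values plays the role of "Add Data Row".
        output.append([description, url, price, extra_info])
    return output


def write_table(table, path):
    """Write the finished table once, after the loop (like "Write Range")."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(table)


# Made-up sample input standing in for the original file.
source = io.StringIO(
    "Description,URL,Price\n"
    "Widget,https://example.com/a,10\n"
    "Gadget,https://example.com/b,20\n"
)
rows = list(csv.DictReader(source))
table = build_output(rows)
```

The point of the design is that the output is written once, after the loop, rather than once per row.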

HAPPY DAYS!
Hit me up if you need more info or other help.

Regards,
MikeB


Fine
Welcome to the UiPath community.
Hope these steps help you resolve this:
- Use a READ CSV activity, pass it the file path of the CSV file above, and get the output as a datatable named dt.
- Now use an ADD DATA COLUMN activity to add the new column where you want to insert the newly scraped value.
If the column already exists, we don’t need this activity.

- Then use a FOR EACH ROW activity and pass the datatable dt as input.
- Inside the loop, use an Open Browser activity.
- In that activity, pass row("URL").ToString as the input URL so that it opens the page we want.
- Inside this Open Browser, use either a Get Text activity or the Screen Scraping method from the Design tab in Studio to get the value we want, and store it in a variable named strinput.
- Next, after the Open Browser but still inside the FOR EACH ROW activity, use an Assign activity like this:
row("your new ColumnName") = strinput.ToString

- This writes the scraped data to the new column, so it is stored in the datatable.
- Now use a WRITE CSV activity and pass dt as the input, which will write back to the same CSV file.
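The same in-place approach, sketched in Python for illustration only (it is not UiPath code). The column names `URL` and `NewInfo` and the `scrape_value` function are assumptions made for this example; `scrape_value` is a stand-in for the Open Browser plus Get Text step.

```python
import csv


def scrape_value(url):
    """Hypothetical stand-in for Open Browser + Get Text."""
    return "scraped:" + url


def enrich_csv(path, url_column="URL", new_column="NewInfo", scrape=scrape_value):
    """Read the CSV, fill one extra column per row, write back to the same file."""
    # Like READ CSV: load the whole file into memory first.
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Like ADD DATA COLUMN: only needed if the column is not there yet.
    if new_column not in fieldnames:
        fieldnames.append(new_column)

    # Like FOR EACH ROW with an Assign on the new column.
    for row in rows:
        row[new_column] = str(scrape(row[url_column]))

    # Like WRITE CSV: overwrite the original file with the updated table.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Compared with building a second table, this keeps a single table and a single file, at the cost of overwriting the original data.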

Cheers @simao

MikeB!!! Thanks a lot, it worked perfectly fine! Oh yeahhh! Cheers

Hey @MikeBlades and @Palaniyappan

Thanks for your help, really appreciated. Thanks to you I was able to resolve my problem, but now I face another one.

I am doing the data scraping quite manually (due to website specifics), navigating through the webpages. Unfortunately, the data scraping only returns duplicated data from the first webpage.

What do you think the issue could be? Please see the attached Main file.

Cheers

DS360.zip (3.5 MB)


Fine
Was the next-page button or element selected when you set up the data scraping?

Cheers @simao

Hi @Palaniyappan
I already solved it. I couldn’t use the automatic data scraping with the click on the next page (website issues), so I had to count how many pages the website had and then open each URL edited with the page number. Anyway, it is solved.
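As a rough illustration of that workaround, here is a Python sketch that generates one URL per page from a known page count. It assumes the site takes the page number as a query parameter called `page`; that parameter name and the example URL are made up, since the real site is not shown in the thread.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def page_urls(base_url, page_count, param="page"):
    """Return one URL per page, setting a (hypothetical) page query parameter."""
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))  # keep any existing parameters
    urls = []
    for page in range(1, page_count + 1):
        query[param] = str(page)
        urls.append(urlunparse(parts._replace(query=urlencode(query))))
    return urls


for url in page_urls("https://example.com/list?sort=asc", 3):
    print(url)
```

Each generated URL can then be opened and scraped on its own, so no click on a "next page" element is needed.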

Now I am struggling with another thing - is this what being new to RPA is like? :smiley:

I have this data table, and I want to open each row’s URL and get some more text. To get that text, I have to click a “show more” field and collect it. The website appears to have some kind of bot detector, because after some repetitions that “show more” field stops working. Do you know if it is possible to open the URLs of consecutive rows in different browsers (Firefox and Chrome, for example)?

Cheers,
Simao

Hi Simao,

I am facing the same issue you had: it can only read the first page. So I listed all the pages in a CSV file, then read and opened each one, but it still only reads the first page's information. Can you please tell me if you faced this issue and how you resolved it?

Hi Hamdan and Simao,

Just thinking out loud in case this helps. There may be an element in the URL that can take you directly to a given page (e.g. /1, /2, /3 or a similar identifier). That way you can go through each row like you are doing now, but this time going to the specific page: open the browser, do the data scraping, close the browser, and repeat the same steps for the next rows. Please let me know if this works. :slight_smile:
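A small Python sketch of that loop, for illustration only. It assumes the page number is the last path segment (/1, /2, /3), and `browser_session` is a hypothetical stand-in that only records when a browser would be opened and closed, so the example runs without a real browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(url, log):
    """Hypothetical stand-in for opening and closing a browser on one URL."""
    log.append(("open", url))
    try:
        yield url
    finally:
        # Closing happens even if the scrape step fails.
        log.append(("close", url))


def scrape_pages(base_url, page_count, scrape):
    """Open, scrape and close once per page, using /1, /2, ... as the page id."""
    log = []
    results = []
    for page in range(1, page_count + 1):
        url = f"{base_url.rstrip('/')}/{page}"
        with browser_session(url, log) as session:
            results.append(scrape(session))
    return results, log


# The lambda stands in for the real scrape step and just returns the page id.
results, log = scrape_pages(
    "https://example.com/items/", 2, lambda u: u.rsplit("/", 1)[-1]
)
```

Because each page gets its own open and close, every scrape starts from a fresh page rather than reusing whatever the previous iteration left on screen.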