[SOLVED] How to set selectors for getting data from product details pages

Hello!
I am learning RPA using UiPath and these days I’m trying to get some data from some online store, and I have a problem.

So far, I can read a category page like https://www.emag.ro/laptopuri/c, scrape the product names and the links to each product page, and put them into an Excel file.
And then I can open each page in a browser.

Now I want to read the title from each product page, for instance from page: https://www.emag.ro/laptop-gaming-asus-tuf-cu-procesor-intelr-coretm-i7-8750h-pana-la-4-10-ghz-coffee-lake-15-6-full-hd-8gb-1tb-nvidiar-geforcer-gtx-1050-4gb-free-dos-black-fx504gd-e4075/pd/D703CFBBM/

PROBLEM: I don’t know how exactly to set the selector(s) in this case.
Right now the selectors are as in the image below, but each time this only gets the title of the page with the list of products, not the title of each product page.

Can you please help me with this?
Thank you!!

I spent a few long hours reading this great forum, but so far those solutions haven’t helped me.
I’m sorry I haven’t noted down all those posts so I can’t indicate them now, but these didn’t help me:

Hi! Did you solve your problem?

Hi, @carmen , and thanks for the kind question!

No, I still haven’t found a way to make it work, after a few days of trying :frowning:
And I studied quite a few tutorials, but they seem to show only very simple cases where the page structure doesn’t change much (maybe only part of the page title), so the selectors are easy to get right automatically.

But in the pages I’m trying to get (mentioned in my first message) it just doesn’t work as I thought it should.
I tried indicating some anchor element, or describing the page structure taken from the HTML (well, the way I understood I should write things in the selector editor)… but no luck so far.
For instance, it gives an error “Cannot find the UI element corresponding to this selector: webctrl tag=‘h1’ parentclass=‘col-xs-12 col-sm-9 col-md-10’”

And I really feel bad about this, because everything else in UiPath seems very nice and easy to comprehend.

So I was hoping someone with more experience would show me a working model that, for instance, can read details from the product pages of the first 2 products from that list, so I have something to study and learn from (because later on I’ll want to scrape other text and pictures from each product page as well).
Again, my beginner’s thought is that the problem is how to make the selectors for the various texts and images in the page (I can open and close a browser window with each product page).

If you could build a working little project for me to learn from, it would be so great!
(or at least recommend me somebody else who could do such a thing to help me understand this stuff)

Thanks for any help!

OK, let’s see… you need to get all the products, right? You said that you are not getting the title? Well, as far as I can see, I can get the title you mentioned without opening the product page.

First, are you using Data Scraping? I guess you have already read about Data Scraping, but maybe you are using it the wrong way.

Anyways, here is the link.

Look at this small example… I am getting the title and the price and writing them to an Excel file.

Sorry, my English is not the best, but I hope this helps you understand.

Main.xaml (6.9 KB)

Thank you for the quick answer, @carmen !

Oh, so I haven’t explained very clearly: I am interested in getting data from each product page (not from the products list page).

Yes, as you’ve already guessed, I can scrape data from the product list pages: product name, product URL, product image URL, and I can also download the product images from that list. I can do that even if the product list spans multiple pages.
And I can open each product page in a new browser window, and later close that window.

Now I want to understand how to get from each product page:
– the product name (it should be possible to get any data from any page, right?);
– the product images (they are in some sort of images carousel);
– the product description;
– the product detailed characteristics.

Currently, my intended workflow is like in the image below.

I tried to use the Attach Browser activity with selector <html app='chrome.exe' title='* - eMAG.ro' /> or with <html app='chrome.exe' />
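As a side note on the wildcard in that first selector: in UiPath selectors `*` stands for zero or more characters, which is close to how shell-style patterns behave. The Python sketch below only illustrates that matching idea with `fnmatch`; it is not UiPath code, and the window titles are made up for the example (only the ` - eMAG.ro` suffix comes from the selector above).

```python
from fnmatch import fnmatchcase

# Hypothetical window titles; only the " - eMAG.ro" suffix is taken from the selector.
titles = [
    "Laptopuri - eMAG.ro",               # assumed title of the product list page
    "Laptop Gaming ASUS TUF - eMAG.ro",  # assumed title of a product page
    "Some other site",
]

pattern = "* - eMAG.ro"  # the title attribute used in the selector

# Show which titles the wildcard pattern accepts.
for title in titles:
    print(f"{title!r} matches: {fnmatchcase(title, pattern)}")

matches = [t for t in titles if fnmatchcase(t, pattern)]
```

If the real titles look anything like these, the pattern accepts both the list page and the product page, so the title wildcard alone does not tell the two windows apart.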

I try to get the page title from each product page using Get Full Text activity inside the Attach Browser activity (also tried with Get Text activity).

  1. It gives the error “Cannot find the UI element corresponding to this selector” if I try to use the partial selector <webctrl tag='H1' class='page-title' /> or the partial selector <webctrl tag='H1' parentclass='page-header has-subtitle-info' />
  2. With the partial selector <webctrl tag='H1' /> it only grabs the title of the product list page and not the separate title of each product page.

So I’m guessing it’s all about those selectors, but I still don’t know how to set them up.
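To make the two symptoms above easier to reason about, here is a small Python sketch of attribute-based matching, using only the standard library. It is an analogy, not how UiPath resolves selectors internally, and the HTML snippet (including the class name `product-title`) is hypothetical, since the real page markup is not shown in this thread.

```python
from html.parser import HTMLParser


class FirstMatch(HTMLParser):
    """Collect the text of the first element with the given tag and attributes."""

    def __init__(self, tag, required):
        super().__init__()
        self.tag = tag
        self.required = required  # attribute name -> exact value that must match
        self.capturing = False
        self.done = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        if self.done or self.capturing or tag != self.tag:
            return
        attrs = dict(attrs)
        if all(attrs.get(k) == v for k, v in self.required.items()):
            self.capturing = True
            self.text = ""

    def handle_data(self, data):
        if self.capturing:
            self.text += data

    def handle_endtag(self, tag):
        if self.capturing and tag == self.tag:
            self.capturing = False
            self.done = True


def find_text(html, tag, required=None):
    """Return the stripped text of the first match, or None if nothing matches."""
    parser = FirstMatch(tag, required or {})
    parser.feed(html)
    return parser.text.strip() if parser.text is not None else None


# Hypothetical markup: the class names are assumptions for illustration only.
PAGE = """
<div class="page-header">
  <h1 class="product-title">Laptop Gaming ASUS TUF</h1>
</div>
"""

print(find_text(PAGE, "h1"))                           # tag only: first h1 found
print(find_text(PAGE, "h1", {"class": "page-title"}))  # attribute not on the element
```

The tag-only lookup returns the first `h1` of whatever document it is given, while a lookup that requires an attribute value the element does not have finds nothing. That mirrors the two cases above: an over-specified selector fails, and a loose selector returns whatever page the search is actually attached to.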

UPDATE - scraping the page title works :smile:

I took the Get Full Text activity out of the Attach Browser activity where the Web Scraping wizard had automatically placed it, and moved it (with the partial selector <webctrl tag='H1' />) into my custom Attach Browser activity (no declared selector). I had previously placed that Attach Browser manually inside my custom Open Browser activity, which receives each product page URL as a changing parameter.
So now the page title (the ‘h1’ tag) is scraped correctly.

(It seems to me that the original Attach Browser kept some fixed context referring to the webpage it was using at declaration time, a context that didn’t change when I modified the selectors later. Only by moving the Get Full Text out of the original Attach Browser could my custom partial selectors adapt to each new webpage opened in the browser.)

Now, let’s see how I can extract the rest of the data!

Good! Well, I think you are doing well. Anyway, as far as I can see, you can also get the title from the first page.

I would do something like this:

  1. Scrape all the products from your first page (I get the title, as you call it, and also the link). I guess you already have this step.

  2. Inside a For Each, open a browser with every link from step 1.

  3. Inside the browser, scrape the images. Here I can also get the link (http: … .jpg); I don’t know if you want to save the images or just keep the link.

  4. Click to show more information.

  5. Scrape the description. I can see that the description is inside a table, so it is better to get the whole table to have all the data.

  6. The specifications are also inside a table, so you can scrape that data too…

Look at this small example… I did it for just 3 products and I print the data with a Write Line… that is not good practice, but I did it just to check that I was getting the info.

Main.xaml (15.4 KB)
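For readers who cannot open the .xaml file, the steps above can be sketched outside UiPath roughly like this. It is a minimal Python illustration using only the standard library, not a translation of the attached workflow. Everything in `SITE` (URLs, markup, class names, values) is hypothetical stand-in data, and step 4 (the click) is skipped because the toy pages have nothing to expand.

```python
from html.parser import HTMLParser

# Hypothetical in-memory "site" (URL -> HTML). A real run would load pages in a browser.
SITE = {
    "/list": '<a class="product" href="/p/1">Laptop A</a>'
             '<a class="product" href="/p/2">Laptop B</a>',
    "/p/1": '<h1>Laptop A</h1><img src="/img/a1.jpg"><img src="/img/a2.jpg">'
            '<table><tr><td>RAM</td><td>8 GB</td></tr>'
            '<tr><td>Storage</td><td>1 TB</td></tr></table>',
    "/p/2": '<h1>Laptop B</h1><img src="/img/b1.jpg">'
            '<table><tr><td>RAM</td><td>16 GB</td></tr></table>',
}


class ListParser(HTMLParser):
    """Step 1: collect (name, link) pairs from anchors marked class="product"."""

    def __init__(self):
        super().__init__()
        self.products = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "product":
            self._href = attrs.get("href")

    def handle_data(self, data):
        if self._href is not None:
            self.products.append((data.strip(), self._href))
            self._href = None


class ProductParser(HTMLParser):
    """Steps 3, 5 and 6: collect title, image URLs and table rows of one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.images = []
        self.rows = []
        self._in = None  # tag currently being read: "h1", "td" or None

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.images.append(dict(attrs).get("src"))
        elif tag == "tr":
            self.rows.append([])
        elif tag in ("h1", "td"):
            self._in = tag

    def handle_data(self, data):
        if self._in == "h1":
            self.title += data
        elif self._in == "td":
            self.rows[-1].append(data.strip())

    def handle_endtag(self, tag):
        if tag == self._in:
            self._in = None


def scrape(site, list_url, limit=3):
    """Step 2: visit each product link and parse it with a fresh parser."""
    listing = ListParser()
    listing.feed(site[list_url])
    results = []
    for name, link in listing.products[:limit]:
        page = ProductParser()  # new parser per page, so no state carries over
        page.feed(site[link])
        results.append({"name": name, "title": page.title.strip(),
                        "images": page.images, "specs": page.rows})
    return results


results = scrape(SITE, "/list")
for item in results:
    print(item)
```

The one design point worth carrying back into the workflow is that each product page gets its own fresh parser, so nothing from the list page or the previous product leaks into the next lookup.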

Let me know if this helps you.


Thanks so much, @carmen!
Once again, a truly help…y helping hand!

And you know, that’s exactly what I just started to do, so your ideas fill me with even more confidence.
Tomorrow is another day (as they once said in a famous movie); but I’ll study your file (I haven’t opened it yet, it’ll be a surprise) and get on to solve the rest of this stuff with so much more enthusiasm than that movie character :))

I’ll let you know what else I learn (or don’t understand), so that it might be of help to others.


Yes, your file has been of great help, @carmen!

  1. Since I can now scrape the textual data from each product details page, I would like to get the entire Description section as it is displayed in the browser (including images, alignment, text formatting etc.) and save it in a Microsoft Word document.
    Can this be done without manually creating a new HTML table in Word and getting the text or image from each table cell in the browser, as some other threads on this forum suggest?

  2. Can I insert text and tables into a Word document anywhere I need? Or can I only use Insert DataTable (which inserts tables one after the other with no space between them, in the order I placed the activities in the workflow) and Append Text (which in my code always inserts text after those inserted data tables)?

Thanks again for any help!

Well, in that case I can imagine something like: open the web page, select, copy, open Word, and paste. It will probably look the same, but I’m not sure.

As for your second question, I’m not sure; I have never tried that before.

I suggest you create a new topic for each question and close this one :slight_smile: Probably somebody with more experience can help you.

Regards,

  1. Great! That’s the plan that I (vaguely) intended to study.
    I’ll take your answer as a confirmation, and I’ll definitely explore that idea tomorrow.

  2. My plan is, first, to place some dummy texts in the Word file, and later to replace them with the proper info obtained from scraping.
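The dummy-text idea in point 2 can be sketched in a few lines of Python. This is only an illustration of the replace step on a plain string, not on a real Word file, and the `{{TITLE}}` style markers, the helper name and the sample values are all assumptions made up for the example.

```python
def fill_placeholders(template, values):
    """Replace each {{KEY}} marker in the template with its scraped value."""
    for key, value in values.items():
        template = template.replace("{{" + key + "}}", value)
    return template


# Hypothetical template text and scraped values.
template = "Product: {{TITLE}}\nDescription: {{DESCRIPTION}}\n"
scraped = {"TITLE": "Laptop A", "DESCRIPTION": "A 15.6 inch laptop."}

document = fill_placeholders(template, scraped)
print(document)
```

Choosing markers that cannot occur in normal text (here, double curly braces) keeps the replace step from touching real content by accident.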

Thanks for all the advice!
