When data scraping, is it possible to get data on a partial html/css match?

Hi all,

When scraping data from the following link, it successfully scrapes the first 5 matches on product name and then skips a number of products.

On inspection, this seems to be because the class name differs from the one used when the initial match was created. It is looking for a class name of ‘s-item__title s-item__title--has-tags’, and the ones that get missed have a class name of just ‘s-item__title’.

So my question is, is there a way to edit the data definition so that anything matching ‘s-item__title*’ is picked up (if that makes sense)?

Or if that is not possible, can you offer some other possible solution for this problem?

Thanks :slight_smile:

So far the only way I have found to get both sets of data from different html/css tags is to do two separate scrapes and then merge the data tables.

Even then, some of the rows from each set end up in the wrong order in Excel, even with no sort order applied.

Hi,
This metadata may solve it.

metadata.txt (521 Bytes)

Thanks,
-Tera
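
For anyone reading later without the attachment: the general idea behind a partial class match is to test whether ‘s-item__title’ appears as one of the tokens in the element's class attribute, rather than comparing the whole attribute string. The sketch below illustrates that idea in plain Python with the standard library only. It is not UiPath metadata and not the contents of metadata.txt; the sample HTML and the names `TitleCollector` and `scrape_titles` are made up for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; ignored when tracking nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta", "wbr"}


class TitleCollector(HTMLParser):
    """Collect the text of elements whose class list contains a given token."""

    def __init__(self, class_token):
        super().__init__()
        self.class_token = class_token
        self.titles = []
        self._depth = 0    # > 0 while inside a matching element
        self._buffer = []  # text pieces of the current matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1  # nested tag inside a matching element
            return
        # Split the class attribute into tokens so that both
        # "s-item__title" and "s-item__title s-item__title--has-tags" match.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_token in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.titles.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


# Hypothetical markup imitating the two class variants from the question.
SAMPLE_HTML = """
<h3 class="s-item__title s-item__title--has-tags">Product A</h3>
<h3 class="s-item__title">Product B</h3>
<h3 class="s-item__subtitle">Not a title</h3>
"""


def scrape_titles(html, class_token="s-item__title"):
    parser = TitleCollector(class_token)
    parser.feed(html)
    return parser.titles


print(scrape_titles(SAMPLE_HTML))
```

Because both variants are collected in a single pass, the results come back in page order, so there is no need to merge two separate tables afterwards.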


Thank you @tera, that did indeed work and everything is in the correct order.

Would you happen to know if the XML schema for the scraping metadata is documented anywhere? I would really like to know more about this, as I’m sure manual edits are quite a common thing to do.

Anyways, thanks for your response :slight_smile:

I’m glad I could help.

However, I could not find any documentation published by UiPath.
What I know is probably fragmentary and may not be accurate enough to pass on to others.
I am sorry that I cannot help further.

The best way to get accurate information about this is to contact the support team.

I hope you get the information you need.
