Web Data Scraping - Cannot get correlated data

Hello Robot Masters,

I was practicing on the page Major Group 01: Agricultural Production Crops | Occupational Safety and Health Administration, which looks like this:

I want to use a single Data Scraping scope to extract the full text of each bullet item (both the numbers and the words, highlighted in the screenshot above) along with its URL. However, after I selected correlated data, I could only get the first bullet item of each group, even after I deleted the idx=“1” attribute. Where did I go wrong?

In the modern UI Automation activities, there is an activity called “Extract Table Data” (similar to the Data Scraping wizard).

You can use this activity: indicate the target on screen, click “Add data”, then click 0111 Wheat, then 0112 Rice, then 0131 Cotton, then click the tick mark in the selection to finish it.

This way you can extract all the names from the 20 rows. As the URL depends on the numbers in the names, you can build the URL column by extracting the numbers (like 0111, 0112, etc.) from the names.

I tried extracting all 20 names this way and it worked. Refer to this workflow:
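
As a rough illustration of the "extract the numbers from the names" step, here is a minimal Python sketch (outside UiPath, just to show the logic). The URL template is a hypothetical placeholder, since the real pattern of the site is not given in this thread, and the function names are made up for the example.

```python
import re

# Hypothetical URL template; replace with the real pattern of the target site.
URL_TEMPLATE = "https://example.com/sic/{code}"


def split_name(name):
    """Split a scraped name like '0111 Wheat' into (code, label).

    Returns None when the name does not start with a 4-digit code.
    """
    match = re.match(r"\s*(\d{4})\s+(.+)", name)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


def build_rows(names):
    """Build one row per name, deriving the URL from the leading code."""
    rows = []
    for name in names:
        parts = split_name(name)
        if parts is None:
            continue  # skip entries without a leading code
        code, _label = parts
        rows.append({"name": name, "code": code, "url": URL_TEMPLATE.format(code=code)})
    return rows


rows = build_rows(["0111 Wheat", "0112 Rice", "0131 Cotton"])
```

Inside a workflow, the same idea would presumably be a regular-expression match on each row of the extracted DataTable, with the result written to a new column.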
A_test.xaml (9.7 KB)

Hi Surya,

Thank you so much for the kind solution and example workflow :grinning:
However, I was looking for a more generic solution. If the website gets updated and the URL no longer depends on the numbers in the names, I would have to add another extraction step and try to link the two extractions, which might be excessive. That is why I wanted to “use one single Data Scraping scope” to extract both the names and the URLs. I would appreciate any further suggestions.
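
To make the goal concrete, here is a small Python sketch of what "names and URLs in one pass" means, using only the standard library. The HTML snippet and its hrefs are hypothetical stand-ins for the page structure (nested lists with one link per bullet item); the real markup may differ.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a> element, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Hypothetical markup: groups as outer list items, bullet items as nested links.
SAMPLE_HTML = """
<ul>
  <li>Industry Group 011
    <ul>
      <li><a href="/hypothetical/0111">0111 Wheat</a></li>
      <li><a href="/hypothetical/0112">0112 Rice</a></li>
    </ul>
  </li>
  <li>Industry Group 013
    <ul>
      <li><a href="/hypothetical/0131">0131 Cotton</a></li>
    </ul>
  </li>
</ul>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
links = collector.links
```

Here the name and the URL come from the same element, so they stay paired even if the URL stops depending on the number in the name.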

I tried the normal Data Scraping scope too.

The problem is that all the names under Industry Group 011 have to be extracted into one column, the names under Industry Group 013 into another column, and so on. You can also extract the URLs into separate columns.

So I feel the solution I gave in my previous post is the best one as of now.