Data scraping from multiple tables in one HTML page

Greetings

I am having difficulty developing my program. Suppose I have an HTML report that contains several tables. I only need the data from the tables that contain the keyword “Machine Name”. The number of tables containing that text varies (it could be 5, 10, 11, etc.), and the row count of each table also varies (it could be 4 rows, 10, 20, etc.) because the HTML is generated from a report.

I have tried the data scraping feature in UiPath, and I can get the data if I extract from each table one by one.

So I would like to automate the extraction from this HTML page. So far I have tried this workflow:

  1. Open the browser and load the HTML
  2. Send hotkey Ctrl+F (for find)
  3. Type “machine name”
  4. Get the total count of “machine name” matches in the HTML and set that as the limit for the counter
  5. Create a Do While with the condition that the counter hasn’t reached the limit
  6. Extract the data table that contains “machine name”
  7. Append the result to Excel
  8. Click next in the find bar to go to the next “machine name”
  9. Loop until it reaches the end of the results

With that workflow, the program runs the extraction as many times as the limit, but it extracts the data from the first table every time. I am guessing that I put the wrong argument/input in step 6 (Extract Structured Data).

Please help. Thanks

Please take a look at this example result from my Excel file. Notice that it repeats the data from the first table only.

The only problem I can see is this: if your first table has several rows containing “machine name”, I think your workflow loops once per matching row. For each such row it takes the entire datatable and puts it in the Excel file, then moves to the next row, which is still in the same datatable, so it appends the same datatable again, and so on. I may be wrong; I need to see your workflow to get more information. Can you share it?
After that you have a few possibilities. You can use the selector to find out which table you are in, and create a Boolean or an Int that you increment to say “OK, I’ve already worked on this table, go to the next table”.
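
To make the counter idea concrete, here is a minimal Python sketch (not UiPath code). The `tables` list and both function names are hypothetical stand-ins invented for illustration; the point is only that the loop counter has to select which table is extracted on each pass, otherwise the first table is appended every time.

```python
# Hypothetical stand-in for the report: each table is a list of rows.
tables = [
    [["Machine Name", "A1"], ["Status", "OK"]],
    [["Machine Name", "B2"], ["Status", "Down"]],
    [["Machine Name", "C3"], ["Status", "OK"]],
]

def extract_fixed(tables, limit):
    """Mimics the reported behaviour: the extraction target never changes."""
    return [tables[0] for _ in range(limit)]

def extract_indexed(tables, limit):
    """Uses the loop counter to choose the table extracted on each pass."""
    results = []
    counter = 0
    while counter < limit:
        results.append(tables[counter])
        counter += 1
    return results

limit = len(tables)
repeated = extract_fixed(tables, limit)
distinct = extract_indexed(tables, limit)

print([t[0][1] for t in repeated])  # first table on every pass
print([t[0][1] for t in distinct])  # one entry per table
```

In a real workflow the equivalent of `tables[counter]` would be whatever part of the selector distinguishes one table from the next, which has to be found by inspecting the page.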

I hope you understand what I’m trying to say; sometimes I lose myself.
Don’t hesitate to ask if what I say looks weird.

Regards,

Sure, here is my workflow. However, for security reasons I cannot share the HTML file, as it contains too much private data, so I can only give a screenshot example of the HTML file. Notice that I’ve marked the data I need with a blue box.

Main.xaml (17.4 KB)

Appreciate the help. thanks

Yes, I think that is indeed my problem. I don’t know how to make the selector/extractor take the data from the next table, the one reached by clicking “next” in Ctrl+F.
Could you give an example of that? Thanks.

(Don’t worry about not sharing the HTML; it’s totally normal not to share private data.)

I hope the selectors on this page are nicely separated. You have a few options, I guess. My first thought is to open UI Explorer on this page and see which selector clearly distinguishes the first datatable with “Machine name” from the second one.

My first idea is to store this specific selector in a string and, for each row, check whether the selector is already in the string. If not, append the datatable to your result and add the selector to the string; if it is already there, just go to the next row. Try this, and if it doesn’t work we will find another solution.

I have never had this problem, which is why I am only writing ideas and not a real solution. In UiPath, as in other development work, there are several solutions to one problem: first find one solution, and when it works and you have time, try to optimise it or try another one.

Hope that can help

Regards

Upon further inspection of the HTML, I found that the data scraper was able to select the tables nicely one by one.
After scraping several tables manually, I found that the only difference is the table row attribute, which varies for each table (for example, see the screenshot). Do you know how to make UiPath extract data from each table row that contains it?

thanks


Hmm, if we take yesterday’s idea about the array of strings, you just have to use the Get Attribute activity, which can retrieve the value from the selector as a string. In your loop, check whether this value has already been added to the array of strings. If not, add it, take the datatable, append it to your final datatable, and go to the next row; if it already exists, go immediately to the next row.

Try this first idea and tell us if it works.
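
The bookkeeping described here can be sketched in Python as follows. This is only an illustration of the "already seen" check, not UiPath code: `collect_unique_tables`, the `matches` list, and the keys `tbl-1` and `tbl-2` are all made up, standing in for whatever attribute value Get Attribute returns for each keyword hit.

```python
def collect_unique_tables(matches):
    """Append each table's rows once, skipping keys that were already seen.

    matches: iterable of (table_key, table_rows) pairs, one pair per keyword hit.
    """
    seen = set()       # attribute values of tables already processed
    collected = []     # plays the role of the final datatable
    for key, rows in matches:
        if key in seen:
            continue   # same table again: go straight to the next hit
        seen.add(key)
        collected.extend(rows)
    return collected

# Hypothetical hits: the first table is matched twice, the second once.
matches = [
    ("tbl-1", [["Machine Name", "A1"]]),
    ("tbl-1", [["Machine Name", "A1"]]),
    ("tbl-2", [["Machine Name", "B2"]]),
]
unique_rows = collect_unique_tables(matches)
print(unique_rows)
```

A set is used instead of one long string so that a key which happens to be a substring of another key is not mistaken for a duplicate.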

In cases such as this, I have found the FindChildren activity quite useful. Depending on the amount of data you wish to scrape, it can be used to grab the entire dataset based on correlated attributes of elements between the two tables. It works by taking the full HTML of the web page, which you can then filter for the elements that share the same attributes. So if you filtered by < colname=‘machine name’ >, you could arrive at all the details in Machine Name across all tables through a single activity.
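
As a rough illustration of the "take the full HTML, then filter" idea, here is a Python sketch using only the standard library. It is not the FindChildren activity and makes several assumptions: the `TableCollector` class, the `tables_with_keyword` function, and the sample HTML are invented for the example, the tables are not nested, and every row and cell has an explicit closing tag.

```python
from html.parser import HTMLParser

class TableCollector(HTMLParser):
    """Collects every <table> as a list of rows, each row a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None    # row currently being read, if any
        self._cell = None   # text fragments of the current cell, if any

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None

def tables_with_keyword(html, keyword="Machine Name"):
    """Return only the tables in which some cell contains the keyword."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    needle = keyword.lower()
    return [
        table for table in parser.tables
        if any(needle in cell.lower() for row in table for cell in row)
    ]

# Made-up sample: two tables mention the keyword, one does not.
sample = """
<table><tr><th>Machine Name</th><th>Status</th></tr><tr><td>A1</td><td>OK</td></tr></table>
<table><tr><th>Operator</th></tr><tr><td>Sam</td></tr></table>
<table><tr><th>Machine Name</th><th>Status</th></tr><tr><td>B2</td><td>Down</td></tr></table>
"""
matching = tables_with_keyword(sample)
print(len(matching))
```

Because the filter runs over every table at once, the number of matching tables and the number of rows in each no longer need to be known in advance.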

This might be of some help