Data Scraping the HTML table

studio

#1

Hi,

      Could someone help me scrape data from the sample HTML table below?
Month Savings
January $100
February $80

#2

Hi @Naga_Varma, something odd happened when I tried to extract the entire HTML table. I am not sure whether that was caused by pasting the table here, but I figured out a way to do it.

Try the Data Scraping wizard: select one of the table cells and choose No in the pop-up that appears.

You are then asked to select another cell from the same column, and the entire column is extracted.
Then choose the Extract Correlated Data option and select the corresponding Savings column elements to extract that data as well.
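
For anyone who wants to see the idea outside the wizard, here is a minimal Python sketch of what "extract one column, then extract correlated data" amounts to: two column lists paired up row by row. This is only an illustration and not how UiPath implements it; the function name `correlate_columns` and the sample lists are made up for this example.

```python
def correlate_columns(first_column, second_column):
    """Pair the values of two extracted columns row by row."""
    # Both columns must describe the same rows, so their lengths must match.
    if len(first_column) != len(second_column):
        raise ValueError("columns have different lengths")
    return list(zip(first_column, second_column))


# Hypothetical result of extracting the Month column first,
# then the correlated Savings column.
months = ["January", "February"]
savings = ["$100", "$80"]

rows = correlate_columns(months, savings)
print(rows)
```

The length check is there because pairing columns of different lengths would silently misalign or drop rows.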


#3

Kaderms, thanks for your reply, but I am not able to get it working. Below is the sample HTML that I am trying.

<!--
<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {
    border: 1px solid black;
}
</style>
</head>
<body>

<table>
  <tr>
    <th>Month</th>
    <th>Savings</th>
  </tr>
  <tr>
    <th>January</th>
    <th>$100</th>
  </tr>
  <tr>
    <th>February</th>
    <th>$80</th>
  </tr>
</table>

</body>
</html>
-->

I know that the <th> tag is used instead of the <td> tag.
Is it possible to scrape the data above in any way?

Thank You,
Naga Varma

#4

Yes, that is totally possible.
I am able to scrape the table with the HTML definition you provided, using the same steps that I mentioned above.
I tried it using https://www.w3schools.com/html/tryit.asp?filename=tryhtml_tables2
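
As a side note, a table that uses <th> for every cell is still an ordinary table as far as parsing goes. Below is a rough sketch in plain Python (standard-library `html.parser`, not UiPath) that treats <th> and <td> the same way and pulls out the rows. The class name `TableParser` and the `SAMPLE` string are assumptions for illustration, with the sample trimmed down from the HTML posted above.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect table rows, treating <th> and <td> alike as cells."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Trimmed copy of the posted table: every cell is a <th>.
SAMPLE = """
<table>
  <tr><th>Month</th><th>Savings</th></tr>
  <tr><th>January</th><th>$100</th></tr>
  <tr><th>February</th><th>$80</th></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE)
parser.close()

header, *body = parser.rows
print(header)
print(body)
```

The first row is taken as the header purely by position, since the tags alone cannot tell header cells from data cells in this table.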


#5

I guess this is because of the double-line border in your table. I was able to extract the entire table with Data Scraping using the HTML definition below. I am not sure why your table is displayed that way, but I guess it has something to do with the HTML configuration.

<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {
    border: 1px solid black;
    border-collapse: collapse;
}
th, td {
    padding: 5px;
    text-align: left;
}
</style>
</head>
<body>

<h2>Table Caption</h2>
<p>To add a caption to a table, use the caption tag.</p>

<table style="width:100%">
  <caption>Monthly savings</caption>
  <tr>
    <th>Month</th>
    <th>Savings</th>
  </tr>
  <tr>
    <td>January</td>
    <td>$100</td>
  </tr>
  <tr>
    <td>February</td>
    <td>$50</td>
  </tr>
</table>

</body>
</html>