Data scraping error when scraping tables


#1

I was trying to use Data Scraping to extract a table from a website. The table has a fixed header column and a fixed header row. When I start scraping with a single click on any cell, the robot asks me “You selected a table cell, would you like to extract the whole table?”, so I said yes, but the resulting table is corrupted: all the cells of the first column are concatenated into the first cell. I figured out that the cause of the problem is the existence of the header column (th tags).
So can anyone help me to extract the table correctly?


#2

Hi @ShaimaaHafez,

When you get “You selected a table cell, would you like to extract the whole table?”, say “No” and then manually select each header and then a corresponding row below that header. Continue for each column. You’ll see what I mean once you select “No”.

Regards,
Troy


#3

Thank you, but I need the robot to identify the table itself, as there are many columns in the tables I want to extract. I also want to make a scalable program that runs on many different tables.


#4

Hi @ShaimaaHafez

Use OCR instead of Full Text or Native.

regards,
venkatesh.


#5

Thank you, but I am using Data Scraping, not Screen Scraping.


#6

Try the Extract Structured Data activity for this. It might solve your problem.


#7

Same problem. The Extract Structured Data activity (which is also part of Data Scraping) defines the extract metadata as
“”
but because the first column is a header, it concatenates all of that column's cells into the first cell as part of the column names, and the generated table is corrupted.
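
To make the th issue concrete, here is a minimal sketch outside UiPath of how such a table can be read row by row so that each row header stays in its own row. It is plain Python using only the standard library `html.parser`; the names `TableRows` and `parse_table` and the sample HTML are hypothetical and are not part of UiPath or of the page in question. It assumes well-formed markup with explicit closing tags and no nested tables.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect every <tr> as a list of cell texts, treating <th> and <td> alike."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that sits inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table as a list of rows (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample: a header row plus a header column made of <th> cells.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Q1</th><th>Q2</th></tr>
  <tr><th>Alice</th><td>10</td><td>12</td></tr>
  <tr><th>Bob</th><td>7</td><td>9</td></tr>
</table>
"""

rows = parse_table(SAMPLE)
header, body = rows[0], rows[1:]
print(header)
for row in body:
    print(row)
```

Because th and td are handled the same way inside each tr, the row headers ("Alice", "Bob") end up as the first cell of their own rows instead of being merged into the column names.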