[HowTo] Data Scraping - Advanced Configuration - Text Field, Image Source, Url, CSS Classname, Hover text

This HowTo shows how Data Scraping can be configured to also retrieve non-standard information from a web table. After the data columns are indicated with the wizard, the extract data definition is edited afterwards and each column's attribute is changed to the relevant one, e.g. value (text field), src (image source), class (CSS class name), title (hover text), href (URL).


The following web table is used for data scraping; the non-text information should be retrieved as well.


We are interested in the following details:

  • ID
  • Name
  • Task
  • Circle Type
  • Hover text of circle
  • Prio info
  • Url
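To make the target concrete, the idea of reading one attribute per column can be sketched outside UiPath in plain Python. This is only an illustration, not what the wizard does internally: the row markup, the sample values and the column-to-attribute mapping below are assumptions, since the real table is only shown in the screenshot.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for one row of the web table.
SAMPLE_ROW = """
<tr>
  <td>1</td>
  <td><input type="text" value="Alice" /></td>
  <td>Review</td>
  <td><img src="img/green.png" title="On track" /></td>
  <td><i class="prio-high"></i></td>
  <td><a href="https://example.com/item/1">Open</a></td>
</tr>
""".strip()

# Assumed mapping: column name -> (cell index, child tag or None, attribute).
# "text" stands for the visible text, like the wizard's default attribute.
COLUMNS = {
    "ID": (0, None, "text"),
    "Name": (1, "input", "value"),
    "Task": (2, None, "text"),
    "CircleType": (3, "img", "src"),
    "HoverText": (3, "img", "title"),
    "PrioInfo": (4, "i", "class"),
    "Url": (5, "a", "href"),
}


def extract_row(row_xml):
    """Return one record with the configured attribute for each column."""
    row = ET.fromstring(row_xml)
    cells = row.findall("td")
    record = {}
    for name, (idx, tag, attr) in COLUMNS.items():
        node = cells[idx] if tag is None else cells[idx].find(tag)
        if node is None:
            record[name] = ""
        elif attr == "text":
            record[name] = "".join(node.itertext()).strip()
        else:
            record[name] = node.get(attr, "")
    return record


record = extract_row(SAMPLE_ROW)
print(record)
```

Each entry plays the role of one column in the extract data definition: a path to the element plus the attribute to read from it.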

Preparation / Analysis

It is always recommended to do a quick check with the browser's web tools (F12) and / or UiExplorer. The table looks like this:

The quick look shows us:

  • it is organized in a tabular structure based on a table tag (instead of a div-based table representation)
  • the different information sources are identified and marked in yellow
  • the first row contains the headers in TH tags

So it looks good, let's do the retrieval.

Data Scraping configuration

First Column (ID)

Start the Data Scraping wizard

  • Select Element dialog - click Next
  • click on the first ID value
  • the following dialog is displayed:
  • Click No (Nein) - we want fine-grained control over the retrieval configuration
  • Select Second Element dialog - click Next
  • click on the second ID value
  • the following dialog is shown:
  • No URL extraction is required; the column name is set later

The following preview is shown:

Second Column (Name)

  • In the preview dialog, click Extract Correlated Data
  • as with the first column, indicate the first element - the first name
  • indicate the second element - the second name
  • the result is:

Regardless of whether the selectors are reported as correct or invalid, the empty column values are expected.

An empty result is returned because the name is not text in the data cell. The name is the value of a text field (refer to the screenshot above).
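The effect can be reproduced with a minimal Python sketch. The cell markup is an assumed stand-in for the real table cell; it only illustrates why the text of the cell is empty while the value attribute of the input carries the name.

```python
import xml.etree.ElementTree as ET

# Hypothetical cell: the name lives in the input's value attribute.
cell = ET.fromstring('<td><input type="text" value="Alice" /></td>')

# What a text extraction sees: the cell has no text nodes at all.
visible_text = "".join(cell.itertext()).strip()

# What a value extraction sees: the attribute of the input element.
field_value = cell.find("input").get("value")

print(repr(visible_text), repr(field_value))
```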

Let's adapt the extraction with the following steps:

  • Click Edit Data Definition
  • Validate that the extraction definition is selecting an input
  • Check that the second table cell is selected: td idx='2'
  • change the attribute from text to value:

And validate the newly generated preview:

Additional columns

  • repeat the steps from the first column and add the other columns by indicating each column's first element value, then its second element value
  • Click on Edit Data Definition and modify it as follows:


Final Result

The datatable with the extracted values. The PrioInfo values are the different CSS classes. In a conversion run, this info can also be mapped, e.g. …circle-up = HIGH etc.
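Such a conversion could look like the following Python sketch. Only the pair "circle-up" = HIGH comes from the example above; the other suffixes, the labels MEDIUM and LOW, the "x-" prefix in the usage lines and the UNKNOWN fallback are hypothetical. Matching is done on the end of the class name because the full class names are not spelled out here.

```python
# Only "circle-up" -> HIGH is taken from the article; the rest is hypothetical.
PRIORITY_BY_SUFFIX = {
    "circle-up": "HIGH",
    "circle-right": "MEDIUM",  # hypothetical class suffix and label
    "circle-down": "LOW",      # hypothetical class suffix and label
}


def map_priority(css_class, default="UNKNOWN"):
    """Map an extracted class attribute (may hold several classes) to a label."""
    for token in css_class.split():
        for suffix, label in PRIORITY_BY_SUFFIX.items():
            if token.endswith(suffix):
                return label
    return default


# Usage with made-up class values:
print(map_priority("x-circle-up"))
print(map_priority("icon x-circle-down"))
print(map_priority(""))
```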


  • After each edit of the extract data definition, copy the modified extract metadata XML to the clipboard
  • First add / select all the different columns, and edit the extract data definition at the end.
    • Reason: when a new column is added after the extract data definition has been modified, the modifications are reset. That's why the partial results are also copied to the clipboard
  • in case of suspicious preview results after heavy editing rounds, stop the wizard and restart it


HowTo_TableFieldClassImgLink.zip (175.5 KB)


For questions on your retrieval case, open a new topic and get individual support.

Cool article! I moved it to our FAQ category.

Hi @ppr, will the steps be the same if the table is represented as a div table?

We have also done it in some projects where the data was organized in rows and columns, e.g. represented by divs.

The very important part is to define a reliable row iterator selector and consistent selectors to the correlated data within the extract data definition.
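As a rough illustration of that principle outside UiPath, here is a Python sketch. The div markup and the class names are assumptions: one expression picks the rows, and every column is then located relative to its row.

```python
import xml.etree.ElementTree as ET

# Hypothetical div-based table; class names are made up for this sketch.
DIV_TABLE = """
<div class="table">
  <div class="row"><div class="id">1</div><div class="name"><input value="Alice" /></div></div>
  <div class="row"><div class="id">2</div><div class="name"><input value="Bob" /></div></div>
</div>
""".strip()

root = ET.fromstring(DIV_TABLE)

rows = []
# Row iterator: one path that matches every data row and nothing else.
for row in root.findall("./div[@class='row']"):
    # Correlated data: each column is addressed relative to the current row.
    rows.append({
        "ID": row.findtext("./div[@class='id']", default="").strip(),
        "Name": row.find("./div[@class='name']/input").get("value", ""),
    })

print(rows)
```

If the row path also matches header or spacer divs, the columns get shifted or duplicated, which is why the row iterator deserves the most attention.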

Awesome! Thanks a lot!

Great article! Thank you.
Is there a way to extract “everything” that is in the block instead of extracting the Text, Class, Value, etc.?
In my case, each element contains “structured data” inside (4 elements). But sometimes 1 element is missing, so I would like to extract everything (everything = source code) so I can parse it manually.
Any idea how I can do that?