I was trying to scrape this URL https://www.wallysplace.se/ to get all pizzas and their ingredients.
But the menu is split into several separate chunks.
Any advice is welcome and this is just for fun.
How to scrape segmented tables depends on the web page, but it can often be achieved by combining:
- dynamic index / selectors
- find children.
So the first thing is to grab as much as possible (shown for the left column; the right one is quite similar),
then we check for further possible iterators.
All in all, it needs a more detailed analysis to find the best retrieval strategy.
Let us know if you need more help with this.
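To make the idea of combining a dynamic index with a child lookup a bit more concrete, here is a minimal Python sketch (not a UiPath workflow). It runs against a made-up, well-formed stand-in for the page: the class names follow the ones seen on the site, but the `h4`/`p` tags, the row numbers and the pizza entries are assumptions for illustration only, and `collect_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for the page; the real markup may differ.
SAMPLE = """
<html><body>
  <div class="et_pb_row et_pb_row_3">
    <div class="dsm_pricelist_item_wrapper"><h4>Margherita</h4><p>tomato, cheese</p></div>
    <div class="dsm_pricelist_item_wrapper"><h4>Vesuvio</h4><p>tomato, cheese, ham</p></div>
  </div>
  <div class="et_pb_row et_pb_row_4">
    <div class="dsm_pricelist_item_wrapper"><h4>Hawaii</h4><p>tomato, cheese, ham, pineapple</p></div>
  </div>
</body></html>
"""

def collect_items(root, first_row, last_row):
    """Walk the segmented rows by index and read the children of each one."""
    items = []
    for i in range(first_row, last_row + 1):
        # Dynamic selector: only the row index changes between segments.
        row = root.find(f".//div[@class='et_pb_row et_pb_row_{i}']")
        if row is None:
            continue  # this index has no segment, skip it
        # Find children: every price-list item directly inside the row.
        for wrapper in row.findall("./div[@class='dsm_pricelist_item_wrapper']"):
            name = wrapper.findtext("h4", default="").strip()
            ingredients = wrapper.findtext("p", default="").strip()
            items.append((i, name, ingredients))
    return items

root = ET.fromstring(SAMPLE)
items = collect_items(root, 3, 5)
for row_index, name, ingredients in items:
    print(row_index, name, "-", ingredients)
```

Skipping missing indices keeps the loop tolerant of gaps in the numbering, which is likely on a page assembled from separate layout blocks.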
How about HtmlAgilityPack?
- Install HtmlAgilityPack via the Package Manager
- Import the HtmlAgilityPack namespace
- Create a new variable of type HtmlDocument
There are two ways to load the page: 1. From the HTML text
2. With HtmlWeb (see the Html Agility Pack site)
- Select the nodes by XPath
XPath of the Pizzor div: //div[@class='et_pb_row et_pb_row_3']//div[@class='dsm_pricelist_item_wrapper']
- Extract the inner text with a For Each activity
node is the For Each item, of type HtmlNode
Then, in an Assign activity, assign to a String array:
node.InnerText.Trim.Split(Environment.NewLine.ToCharArray,StringSplitOptions.None).Select(Function(s) s.Trim).ToArray
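For readers who want to try the select, split and trim steps outside UiPath, the following is a rough Python equivalent using only the standard library. It is a sketch under assumptions: the snippet is invented sample markup rather than the real page, `inner_text_lines` is a hypothetical helper standing in for the `InnerText` expression above, and `splitlines()` is used instead of splitting on individual newline characters.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; the real page may be structured differently.
SNIPPET = """
<div class="et_pb_row et_pb_row_3">
  <div class="dsm_pricelist_item_wrapper">
    Margherita
    tomato, cheese
  </div>
  <div class="dsm_pricelist_item_wrapper">
    Vesuvio
    tomato, cheese, ham
  </div>
</div>
"""

def inner_text_lines(node):
    """Join all text inside the node, split it into lines, trim each line."""
    text = "".join(node.itertext())
    return [line.strip() for line in text.strip().splitlines()]

root = ET.fromstring(SNIPPET)
# Same idea as the XPath above, relative to the row element.
xpath = ".//div[@class='dsm_pricelist_item_wrapper']"
rows = [inner_text_lines(node) for node in root.findall(xpath)]
for row in rows:
    print(row)
```

Each entry in `rows` is one item's list of trimmed lines, which corresponds to the String array produced per node in the For Each loop.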