Failed to dynamically scrape data from the Ford website

Hello Good People,

I need help with scraping data dynamically from this website: https://www.ford.com/support/recalls/

You can try with these two input texts:

1FTPX12524NB72614
1FMCU9G61LUA44582


After entering the VIN, click See Recalls. The recall details page then opens. From that page I want to scrape just the Description text, dynamically, for both VINs.

The recall details include: Issue Date, Description, Safety Risk, Remedy, Campaign / NHTSA#.

Kindly advise on how I can approach this issue with the selectors.

Regards,
Kakooza Allan Klaus

The result is an alternating label row / value row structure.

Data scraping / table extraction does not handle this type of structure.

One option:

Find Children, filtered on the rows

  • loop over the result and dynamically construct a datatable or a dictionary (other structures are also possible)

We can also transform a helper structure (e.g. a dictionary) into another format (e.g. a datatable) later.
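The looping idea above can be sketched outside of UiPath as well. The following is only a minimal Python illustration, assuming the row texts have already been collected in page order (label, value, label, value, ...). The function names and the sample values are hypothetical placeholders, not real recall data.

```python
from typing import Dict, List


def pair_rows(rows: List[str]) -> Dict[str, str]:
    """Pair alternating rows: even positions are labels, odd positions are values."""
    cleaned = [r.strip() for r in rows if r.strip()]  # drop empty rows
    if len(cleaned) % 2 != 0:
        raise ValueError("expected label/value pairs, got an odd number of rows")
    return dict(zip(cleaned[0::2], cleaned[1::2]))


def to_table(record: Dict[str, str]) -> List[List[str]]:
    """Turn the helper dictionary into a header row plus one data row."""
    return [list(record.keys()), list(record.values())]


# Placeholder texts standing in for the scraped rows (assumption, not real data).
sample_rows = [
    "Issue Date", "<issue date>",
    "Description", "<description text>",
    "Remedy", "<remedy text>",
]

record = pair_rows(sample_rows)
description = record["Description"]  # the single field the question asks for
table = to_table(record)
```

In a workflow, the same pairing would be a For Each over the found children, with the dictionary converted to a datatable afterwards if needed.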

I wasn’t actually using table extraction. I was using the Get activities and indicating the element on screen.

I’ll try and explore your option.

If you want to try it, here is an alternative option that uses data scraping:

Extract the text and class of each row div (data scraping):

<extract>
	<column exact="1" name="Column1" attr="text" name2="Column2" attr2="class">
		<webctrl tag="div" idx="2"/>
		<webctrl tag="div" idx="1"/>
		<webctrl tag="div" idx="1"/>
		<webctrl tag="div"/>
	</column>
</extract>

Then postprocess the result into the structure of interest.
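As a rough idea of what that postprocessing could look like, here is a small Python sketch. It assumes the extraction returns one (text, class) pair per row div, matching Column1 and Column2 above, and that label rows can be recognised by a token in their class attribute. The token `label` and the sample rows are hypothetical; the real class names have to be read from the scraped output.

```python
from typing import Dict, List, Tuple


def rows_to_dict(scraped: List[Tuple[str, str]], label_marker: str = "label") -> Dict[str, str]:
    """Use the class column to decide whether a row is a label or a value.

    label_marker is an assumed class token; replace it with the real one.
    Value rows are appended to the most recent label, so a value that is
    split over several divs still ends up under one key.
    """
    result: Dict[str, str] = {}
    current = None
    for text, css_class in scraped:
        text = text.strip()
        if label_marker in css_class.split():
            current = text
            result[current] = ""  # a label without a value keeps an empty string
        elif current is not None and text:
            result[current] = (result[current] + " " + text).strip()
    return result


# Hypothetical scraped rows: (Column1 = text, Column2 = class).
scraped_rows = [
    ("Issue Date", "row label"),
    ("<issue date>", "row value"),
    ("Description", "row label"),
    ("<first part>", "row value"),
    ("<second part>", "row value"),
    ("Remedy", "row label"),
]

fields = rows_to_dict(scraped_rows)
```

Relying on the class rather than on strict alternation keeps the result stable when a value is missing or spans more than one row.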


It looks impressive; let me implement it and get back to you.

@ppr
I did implement this one, but making it dynamic is where I’m getting stuck.
I’ve replaced some of the ids with a variable, but it’s still not working as expected.


Please advise.

We would recommend specifying / clarifying the expected output. In common cases it could be, for example, a table structured like:

Issue Date | Description | Safety Risk | Remedy | Campaign/NHTSA#

We would assume that different recalls can have different structures as well.

As mentioned earlier, we can go for the generic approach (the data scraping option) and then process the result into the desired output. In general, this means looping over the rows and treating each label found as a column name / dictionary key and the content that follows it as the value.

This can be done classically with a For Each over the datarows, and it also has the potential to be compacted with LINQ.
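To make the target table concrete, here is a hedged Python sketch of both styles: a plain loop that groups label/value pairs into one record per recall, and a compact comprehension (playing the role LINQ would play) that projects the records onto fixed columns. It assumes every recall begins with the "Issue Date" label and that the column names match the scraped labels exactly. Both are assumptions, and the sample pairs are placeholders.

```python
from typing import Dict, List, Tuple

# Column names taken from the field list in the question; adjust to the scraped labels.
COLUMNS = ["Issue Date", "Description", "Safety Risk", "Remedy", "Campaign / NHTSA#"]


def split_recalls(pairs: List[Tuple[str, str]], first_label: str = "Issue Date") -> List[Dict[str, str]]:
    """Classic loop: start a new record whenever the first label shows up again."""
    records: List[Dict[str, str]] = []
    for label, value in pairs:
        if label == first_label or not records:
            records.append({})
        records[-1][label] = value
    return records


def to_rows(records: List[Dict[str, str]], columns: List[str] = COLUMNS) -> List[List[str]]:
    """Compact projection: one row per recall, empty string for a missing field."""
    return [[rec.get(col, "") for col in columns] for rec in records]


# Placeholder label/value pairs for two recalls with different structures.
sample_pairs = [
    ("Issue Date", "<date 1>"),
    ("Description", "<description 1>"),
    ("Issue Date", "<date 2>"),
    ("Remedy", "<remedy 2>"),
]

rows = to_rows(split_recalls(sample_pairs))
descriptions = [rec.get("Description", "") for rec in split_recalls(sample_pairs)]
```

Filling missing fields with an empty string keeps every row the same width, which is what a datatable needs when recalls differ in structure.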

Just let us know which target output structure / format you are interested in, and we can help with the next steps.

Watch the Loom below to get more context about the issue I’m having.


First of all, a big :+1: for this extraordinary example of how a requester can take care to be understood when some parts may not be clear to the other side.

For data scraping we have the selector to the jump-in element and the extract metadata config for the extraction.

Below are possible settings for constructing a generic selector to the jump-in element:
VIN: 1FTPX12524NB72614
[screenshot]

VIN: 1FMCU9G61LUA44582

As you can see, we can abstract both to the same common paths / structure definitions.


Thanks for the appreciation, buddy.
Thank you as well for your help; it did work.
