Scrape data from a web page and check whether all of the scraped data is already in the database

Hi,
I want to scrape comments from the Amazon.in website and store the data in a database. I want to run the scrape every day, but scraping all of the comments every day would take a long time.


This is my result data from the first page:
[image]
So I want to scrape the first page and then check whether all of the scraped data from that page is already in the database. If every row from the first page is in the database, close the tab. If one or more rows from the first page are not in the database, I want to go to the next page, scrape it, and check whether the scraped data from the second page is in the database. I want to repeat the same process for each following page. How can I do this?
Thank you.

Hi @arijit1213 :wave:

Do you have a unique identifier, such as an ID, that you can use to match your scraped data against the data in the DB? If so, this can be done in a simpler way.
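
To show why an ID makes this simpler, here is a rough sketch in Python rather than UiPath. It assumes each scraped row carries a unique `review_id` (a made-up column name; your table does not have one yet) and that you have already loaded the stored IDs into a set.

```python
def new_rows_by_id(scraped_rows, known_ids):
    """Return the scraped rows whose ID is not in the database yet.

    scraped_rows: list of dicts, one per scraped comment.
    known_ids: set of IDs already stored ("review_id" is a hypothetical key).
    """
    return [row for row in scraped_rows if row["review_id"] not in known_ids]
```

If this returns an empty list for a page, every row on that page is already stored and you can stop there.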

Hi @monsieurrahul,
I want to compare the whole row.
For example, (title, posted_by, post_date, rate, review, product_id) are the column names in both my database table and the scraped data. I want to compare the whole row.

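To compare the whole row, take your two DataTables (the one from the DB and the one you have scraped), iterate over them with nested For Each loops, and compare each pair of rows using row1.ItemArray.SequenceEqual(row2.ItemArray). A plain row1.ItemArray = row2.ItemArray will not work, because the = operator is not defined for arrays in VB.NET. Both tables also need the same column order and the same column types for the rows to match.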
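
The same whole-row check, sketched in Python just to show the logic (this is not a UiPath workflow). It assumes both tables have been read into lists of dicts with the column names you listed. Converting every value to a trimmed string is my own assumption, so that a rating stored as a number in the DB still matches the scraped text.

```python
COLUMNS = ("title", "posted_by", "post_date", "rate", "review", "product_id")


def row_key(row, columns=COLUMNS):
    """Turn a row into a tuple of trimmed strings so rows can be compared."""
    return tuple(str(row[col]).strip() for col in columns)


def page_fully_stored(scraped_rows, db_rows, columns=COLUMNS):
    """Return True if every scraped row already exists in the DB rows."""
    stored = {row_key(row, columns) for row in db_rows}
    return all(row_key(row, columns) in stored for row in scraped_rows)
```

Building a set of the DB rows once avoids comparing every scraped row against every DB row in a nested loop.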

Hi @monsieurrahul,
Thank you.
If I only want to compare the title, posted_by, review and product_id columns, what should I do?

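In that case, use the same two nested loops and compare the columns you need one by one, like this:

For Each row As DataRow In DatabaseDT.Rows
    For Each row2 As DataRow In ScrapedDT.Rows
        ' Add the other columns (posted_by, review, product_id) with AndAlso
        If row("Title").ToString = row2("Title").ToString Then
            ' Whatever you want to do
        End If
    Next
Next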
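
For the page-by-page part of your question, here is a rough sketch of the stopping logic in Python (again, not UiPath). `fetch_page` is a hypothetical function that returns the scraped rows for a page number, and `max_pages` is a safety limit I added. The rows are compared only on the columns you pass in.

```python
def scrape_until_known(fetch_page, db_rows, columns, max_pages=50):
    """Scrape pages until one contains no rows that are new to the database.

    fetch_page: hypothetical callable, page number -> list of row dicts.
    db_rows: rows already stored in the database, as dicts.
    columns: the column names used to decide whether two rows match.
    Returns the list of rows that are not in the database yet.
    """
    def key(row):
        return tuple(str(row[col]).strip() for col in columns)

    stored = {key(row) for row in db_rows}
    new_rows = []
    for page in range(1, max_pages + 1):
        rows = fetch_page(page)
        unseen = [row for row in rows if key(row) not in stored]
        if not unseen:
            # Empty page, or every row is already stored: stop here.
            break
        new_rows.extend(unseen)
        stored.update(key(row) for row in unseen)
    return new_rows
```

You would call it with something like `columns=("title", "posted_by", "review", "product_id")` and then insert the returned rows into the database.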

Hi @monsieurrahul,
Can you send me the workflow?

Your process involves a DB, so it's a bit hard for me to build the whole thing. Instead, I'll give you a workflow that demonstrates the row comparisons. You can do the same with the DT you are fetching from the DB.

Sounds good @arijit1213?

@monsieurrahul
Yes.
Thank you.

Hi @monsieurrahul,
Can you please give me the workflow demonstrating this part: the two nested loops over DatabaseDT and ScrapedDT with the comparison row("Title").ToString = row2("Title").ToString?

Here you go @arijit1213: Arijit-Forum.xaml (10.2 KB)

Get back to me if you face any issues!

Cheers!