Hello guys, it’s me again. Just had a thought: is it possible to scrape a webpage through the “View page source” option? It would be way easier and faster if possible. Waiting for your response, thanks.
May I know what you are trying to achieve by that?
Basically, you can get the whole web page by using the Get Attribute activity with the innerhtml attribute on the main (complete) page element.
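For comparison outside UiPath: the text shown by “View page source” is the raw HTTP response body, so it can often be fetched without opening a browser at all. Below is a minimal Python sketch, assuming the page needs no login and the data is present in the initial response rather than loaded later by JavaScript. The function name `fetch_page_source` is hypothetical and only used for illustration.

```python
from urllib.request import urlopen


def fetch_page_source(url: str, timeout: float = 10.0) -> str:
    """Return the raw response body for `url`, decoded to text."""
    with urlopen(url, timeout=timeout) as response:
        # Use the charset declared in the response headers, else fall back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (hypothetical URL, replace with a page you are allowed to scrape):
# html = fetch_page_source("https://example.com/parts?page=1")
```

The returned string is the same markup you would see in the page source view, ready for regex or tag-based extraction.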
Cheers
Hello Anil G,
You see, I’m trying to scrape one of our partner’s pages. They sell car parts.
While scrolling through the list, none of the information that I need is shown. With that said, I would need to go page by page a million times to get the data that I need. But when I click on the “View page source” button, I can find the info that I need piled up in rows.
Basically, I would love to write a robot that would go to a page that I have written down in Excel, then, once it is on the page, go to view page source, find the rows that it needs, and copy them to Excel.
The thing is that when I am scrolling through the list of products, I don’t see the info that I need to take. I only see it when I go to a product’s inner URL. But when you click on View page source in the list, you can see the info that you need.
Maybe I am missing something? Maybe there is a better and more efficient way to do this? Please let me know.
Try saving the web page as an HTML document and reading the data with the Read Text File activity. Then, if the data comes in, we can try regex or tags to identify the required rows.
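The save-then-regex idea can be sketched in Python as follows. The markup is an assumption: the sample HTML, the `part` class, and the SKU/price cells are made up for illustration, since the real partner page was not shown in this thread. The pattern would need to be adapted to whatever the actual rows look like.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical markup: one <tr class="part"> per product with SKU and price cells.
SAMPLE_HTML = """\
<table>
<tr class="part"><td>AB-100</td><td>12.50</td></tr>
<tr class="part"><td>CD-200</td><td>7.99</td></tr>
</table>
"""

ROW_PATTERN = re.compile(
    r'<tr class="part"><td>(?P<sku>[^<]+)</td><td>(?P<price>[^<]+)</td></tr>'
)


def extract_rows(html_path: Path) -> list[tuple[str, str]]:
    """Read a saved HTML file and return the (sku, price) pairs matched by the regex."""
    text = html_path.read_text(encoding="utf-8")
    return [(m.group("sku"), m.group("price")) for m in ROW_PATTERN.finditer(text)]


# Stand-in for "save the page as an HTML document".
saved_page = Path(tempfile.mkdtemp()) / "page.html"
saved_page.write_text(SAMPLE_HTML, encoding="utf-8")

rows = extract_rows(saved_page)
print(rows)
```

Regex works well when the rows follow one rigid pattern, as they often do in generated pages. If the markup varies (extra attributes, nested tags), a tag-based parser is usually more robust.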
cheers
In addition to:
We can also get the HTML e.g. from outerhtml / innerhtml attribute directly and do our custom extraction / parsing… with the help of
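As one possible sketch of such custom parsing once the HTML string is in hand, the Python standard library's `html.parser` can walk the tags instead of relying on regex. The table layout (`<tr>` rows with `<td>` cells) is an assumption for illustration, and `RowTextParser` and `parse_rows` are hypothetical names.

```python
from html.parser import HTMLParser


class RowTextParser(HTMLParser):
    """Collect the text of every <td> cell, grouped into one list per <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")      # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data     # accumulate text inside the current cell


def parse_rows(html: str) -> list[list[str]]:
    """Return the stripped cell texts per row, skipping rows without <td> cells."""
    parser = RowTextParser()
    parser.feed(html)
    parser.close()
    return [[cell.strip() for cell in row] for row in parser.rows if row]


print(parse_rows("<table><tr><td>AB-100</td><td> 12.50 </td></tr></table>"))
```

Each inner list can then be written out as one spreadsheet row.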
For alternates/variations also have a check of: