I am trying to get the status code of a web page using the HTTP Request activity. My workflow is:
1. Log in to a website.
2. Extract the hyperlinks from the web page and store them in an Excel file.
3. For each hyperlink extracted in step 2, verify that when I navigate to the URL, the page content does not contain messages such as "access denied" or "page not found".
4. At present, I keep the HTTP Request activity inside a For Each loop and pass each URL to it to get the status code.
5. What I observe is that the status code is always 200, even when the web page says "page not found" or "access denied".
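The check described in steps 3 to 5 can be sketched outside UiPath as follows. This is only a minimal Python illustration, not the HTTP Request activity itself: the function name `classify_response` and the phrase list `ERROR_PHRASES` are hypothetical, and the phrases are assumed to match the wording the application actually shows. The idea is to treat a 200 response whose body contains an error message as a failure too, instead of relying on the status code alone.

```python
import re

# Hypothetical phrases; adjust them to the wording your application really uses.
ERROR_PHRASES = ("page not found", "access denied")


def classify_response(status_code: int, body: str) -> str:
    """Return 'http_error', 'soft_error', or 'ok' for a single response."""
    # A genuine HTTP error: the server itself reported a 4xx or 5xx code.
    if status_code >= 400:
        return "http_error"
    # Collapse runs of whitespace and lowercase the body so that a phrase
    # split across lines or written in a different case still matches.
    text = re.sub(r"\s+", " ", body).lower()
    # The server said 200, but the page content reports an error.
    if any(phrase in text for phrase in ERROR_PHRASES):
        return "soft_error"
    return "ok"
```

In a loop over the extracted hyperlinks, each response's status code and body would be passed to `classify_response`, and anything other than `"ok"` would be flagged in the Excel output.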
Currently I am using IE. When I press F12 and open the Network tab, the status of the page is generally 200; when I then refresh the page, a new entry with a code such as 404 (in the case of page not found) gets added.
Can you please help me with this?
Appreciate your response.
But I am already using the above approach. The problem is that the HTTP Request activity is not giving me the proper code. For example, if I get a "page not found" error, I still get status code 200, which ideally should be 404.
I even filtered the body to text/html, but that made no difference.
However, when I check external web pages outside my application, for example facebook.com, I get the proper codes.
As to the cause, I really don’t know. Is it possible that the website involves an IFRAME which gives 404, but the actual page URL you type in does not?
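If you want to test the IFRAME idea, one option is to pull the frame URLs out of the HTML you get back and request each of them separately. Below is a rough Python sketch of just the extraction part, using only the standard library. The names `FrameCollector` and `find_frame_urls` are made up for illustration, and it assumes the frames are present in the static HTML rather than injected later by JavaScript.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collects the src of every <iframe> and <frame> tag in a document."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative src values against the page URL.
                self.frame_urls.append(urljoin(self.base_url, src))


def find_frame_urls(html: str, base_url: str) -> list:
    """Return the absolute URLs of all frames found in the HTML."""
    collector = FrameCollector(base_url)
    collector.feed(html)
    return collector.frame_urls
```

If the list comes back non-empty for one of the problem pages, requesting those frame URLs directly should show whether the 404 belongs to the frame rather than to the outer page.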