Identify status code from HTTP Request Activity

Hello everyone,

I am trying to get the status code of a web page from the HTTP Request activity. My process is:
1. Logging in to a website.

2. Extracting hyperlinks from the web page and storing them in an Excel file.

3. For each hyperlink extracted in step 2, I need to verify that when I navigate to the URL, the page content does not contain messages such as "access denied", "page not found", etc.

4. At present, I keep the HTTP Request activity inside a For Each loop and pass each URL to it to get the status code.

5. What I observe is that the status code is always 200, even when the web page says "page not found" or "access denied".

I am currently using IE. When I press F12, go to the Network tab and refresh the page, I can see that the status of the page is generally 200, and then a new entry with a code such as 404 (in the page-not-found case) gets added.
Can you please help me with this one?

Hi @mudit

You could simply use the HTTP Request activity with the link. Here’s an example output for a non-existent URL:
image

You could even get the Status Code directly from the properties of the activity:
image


Hi

Appreciate your response.
But I am already using the above approach. The problem is that the HTTP Request activity is not giving me the proper code. For example, when I get a "page not found" error, I still get status code 200, which ideally should be 404.
I even filtered the body to text/html, but still no good.
However, when I check external web pages outside my application, for example facebook.com, I get the proper codes.

As you can see, I was expecting a 404 error, which was there in the network properties of IE, but Studio returned 200.

I see. Does the body string contain the phrase “Page not found”? I guess you could then work around it this way.
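
To illustrate the idea, here is a rough sketch in Python rather than as a UiPath workflow. The function name and the phrase list are made up for the example, and it assumes you already have the status code and the body string from the request; adjust the phrases to whatever your application actually prints.

```python
# Hypothetical helper: treats a response as an error if either the status
# code says so or the body contains a known error phrase (a "soft 404").
ERROR_PHRASES = ("page not found", "access denied")  # assumed phrases


def classify_response(status_code: int, body: str, phrases=ERROR_PHRASES) -> str:
    """Return 'error' or 'ok' for one response."""
    # A real HTTP error code is enough on its own.
    if status_code >= 400:
        return "error"
    # Otherwise look for error text in the body, ignoring case.
    lowered = body.lower()
    if any(phrase in lowered for phrase in phrases):
        return "error"
    return "ok"
```

In the workflow itself, the same check would presumably be a case-insensitive Contains test on the body string inside the loop.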

I can use the page title to identify the content, but I want to know the root cause of why this is happening.

How do I extract the body of the response from the HTTP Request activity?

You can get it by utilizing this field here:
image

As to the cause, I really don’t know. Is it possible that the website involves an IFRAME which gives 404, but the actual page URL you type in does not?
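
If you want to test the IFRAME guess, one option is to pull the frame URLs out of the body and request each of them separately to see which one returns 404. A minimal sketch in Python, using only the standard library; the class and function names are hypothetical, and it only looks at `iframe` and `frame` tags in the HTML the server sends (frames added later by scripts would not show up):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameCollector(HTMLParser):
    """Collects the src of every <iframe> / <frame> tag as an absolute URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.sources.append(urljoin(self.base_url, src))


def frame_sources(html: str, base_url: str) -> list:
    """Return the absolute URLs of all frames found in the HTML body."""
    collector = FrameCollector(base_url)
    collector.feed(html)
    return collector.sources
```

Each URL returned could then be passed to the HTTP Request activity in the same loop as the page URL.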