omyek
I'm trying to mimic browsing a web page using HttpWebRequest. I've had
a lot of luck with it so far, including logging into pages, posting
form data, and even collecting and using cookies.

However, I've run into a scenario that baffles me. I have a website
that requires a user to log in. This is nothing new, and I was able to
log in successfully.

For our case, let's say the URL of the login page is http://login.html

After the login page, I'm taken to an interim page where I have to
click a link to actually access the "member" features. The link to
this page happens to be the same as the login link, http://login.html

Now, at the code level, as I said, I was able to log in just fine and
be taken to this interim page. Since I know the "members" page has the
same URL as the login link, I set up my HttpWebRequest to go to the
same URL. The problem is that I'm taken back to the login page instead
of the members page.

As far as I can tell, there are no cookies involved. But somehow the
interim page knows I've logged in, so when I click the
http://login.html link again, it takes me to the member page. I just
don't know how. The link is just a simple <a href> with no JavaScript
or anything else fancy. The HTML source is also simple, with nothing
more than HTML markup. (I say there are no cookies because my code
reports none and because the cookies folder on my hard drive doesn't
show any.)

Does anybody have any clues? If it helps, the http://login.html page
is really a CGI page like this: http://login/cgi-bin/login.cgi.
Possibly something unique to CGI?

The code I'm using can be found below. It's called twice: once to log
in (I supply the necessary POST query) and a second time to try to
access the member area, both times using the same URL. The only
difference is that the first time I do a POST to log in and the second
time I just navigate to the URL.

So to summarize: what am I not saving or looking at that's causing my
second HttpWebRequest to "fail" and take me back to the login page
even though I'm logged in?

// cc is a class-level CookieCollection that holds the cookies collected so far
void navigateTo(string url, bool post, string query)
{
    HttpWebRequest req = null;
    try {
        // set up our request to the given url
        req = (HttpWebRequest)HttpWebRequest.Create(url);
        // we have to have a cookie container if we want to get cookies back
        req.CookieContainer = new CookieContainer();
        if(cc != null && cc.Count > 0) {
            // let's add these cookies
            req.CookieContainer.Add(cc);
        }
        if(post) {
            // now set up the properties we'll need to do a POST
            req.ContentLength = query.Length;
            req.Method = "POST";
            req.ContentType = "application/x-www-form-urlencoded";
            // write our query to the stream
            using(StreamWriter sw = new StreamWriter(req.GetRequestStream())) {
                sw.Write(query);
            }
        }
        // get the response from the url
        HttpWebResponse resp = null;
        try {
            resp = (HttpWebResponse)req.GetResponse();
            // save any cookies we received
            if(resp.Cookies.Count > 0) {
                // looks like we have cookies, let's try and combine them
                // with the other cookies we've collected
                if(cc == null) {
                    // this is the first time we've run across cookies
                    cc = resp.Cookies;
                }
                else {
                    // let's merge our new cookie collection with our old one
                    CookieCollection temp = new CookieCollection();
                    foreach(Cookie c in cc) {
                        bool match = false;
                        foreach(Cookie newc in resp.Cookies) {
                            if(c.Name == newc.Name) {
                                // we have two matching cookie names,
                                // so let's take the new one
                                temp.Add(newc);
                                match = true;
                                break;
                            }
                        }
                        if(!match) {
                            // we didn't find a matching cookie in our new
                            // cookie batch, so let's just add the old one
                            temp.Add(c);
                        }
                    } // end foreach
                    // assign our new cookies to our variable
                    cc = temp;
                } // end else
            }
        }
        catch {
        }
        finally {
            // close our resp
            if(resp != null) {
                resp.Close();
            }
        }
    }
    catch {
    }
} // end function
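
The merge step in the function above can be pulled out into a small helper so it can be checked in isolation, without touching the network. This is only a sketch: the names CookieMerge, Merge and ContainsName are made up for illustration and are not part of the code above. One difference to be aware of is that the loop above only walks the old cookies, so a cookie that shows up for the first time in a later response is not carried over, whereas this sketch keeps it. Whether that has anything to do with the login problem is not established here.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of the original code.
public static class CookieMerge
{
    // Returns a new collection holding every cookie from 'fresh', plus each
    // cookie from 'old' whose name does not appear in 'fresh'.
    // Either argument may be null.
    public static CookieCollection Merge(CookieCollection old, CookieCollection fresh)
    {
        CookieCollection merged = new CookieCollection();

        if (fresh != null)
        {
            // take every cookie from the newest response first
            foreach (Cookie c in fresh)
            {
                merged.Add(c);
            }
        }

        if (old != null)
        {
            // then keep the old cookies that the new response did not replace
            foreach (Cookie c in old)
            {
                if (fresh == null || !ContainsName(fresh, c.Name))
                {
                    merged.Add(c);
                }
            }
        }

        return merged;
    }

    // Case-sensitive name lookup, matching the == comparison used above.
    private static bool ContainsName(CookieCollection cookies, string name)
    {
        foreach (Cookie c in cookies)
        {
            if (c.Name == name)
            {
                return true;
            }
        }
        return false;
    }
}
```

With a helper like this, the nested loops in navigateTo could in principle be replaced by a single assignment such as cc = CookieMerge.Merge(cc, resp.Cookies), again assuming cc is the class-level CookieCollection.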