How to do screen scraping where the site requires a log in

  • Thread starter: Alan Silver

Alan Silver

Hello,

I would like to pull some information off a site that requires a log in.
I have a subscription to a premium content site, and I would like to be
able to do a few automatic requests instead of having to load the site
manually in a browser.

I have seen plenty of articles that explain how to do screen scraping in
.NET, and others that describe how to do it via a POST, but I couldn't
find any that covered my scenario.

Basically, the problem is that the code would first have to request the
home page, then fill in the login fields and post the form back. After
that, the code would need to hang on to the cookie (which is what I
assume they are using) so that when it makes another request (a GET would
be fine here), the site allows it instead of treating the requester as
not logged in.
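
The flow described above (load the home page, post the login form, then
reuse the cookie for later GETs) can be sketched roughly as follows. This
is only a minimal illustration in Python's standard library, not code from
the site in question: the URLs and form field names are hypothetical, and
a real login form may also expect hidden fields (ASP.NET pages often carry
a __VIEWSTATE value, for example) that would have to be read from the home
page first. In .NET the same idea should apply by assigning one shared
System.Net.CookieContainer to the CookieContainer property of every
HttpWebRequest you create.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def encode_form(fields):
    """Encode form fields as an application/x-www-form-urlencoded body."""
    return urllib.parse.urlencode(fields).encode("utf-8")


def login_and_fetch(opener, home_url, login_url, fields, target_url):
    """Log in with the given form fields, then fetch a protected page."""
    # 1. Load the home page so any pre-login cookies are stored in the jar.
    opener.open(home_url).close()
    # 2. Post the login form; supplying data= makes urllib send a POST.
    opener.open(login_url, data=encode_form(fields)).close()
    # 3. GET the protected page; the opener adds the stored cookies.
    with opener.open(target_url) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical usage (URLs and field names are placeholders):
# opener, jar = make_session()
# html = login_and_fetch(
#     opener,
#     "https://example.com/",
#     "https://example.com/login",
#     {"username": "me", "password": "secret"},
#     "https://example.com/premium/report",
# )
```

The key design point is that every request goes through the same opener,
so whatever cookie the site sets at login is sent back automatically on
the later requests, much as a browser would do.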

This all works fine in a browser, as the browser handles the cookie for
you, but the code examples I have seen seem to use completely stateless
requests (i.e. no cookies are preserved), so they wouldn't work for a
site like this.

Any ideas? TIA
 
You can try SWExplorerAutomation (SWEA) (http://webunittesting.com).

Thanks, looks interesting. The only shame is that I prefer to write my
own code rather than use someone else's. You don't get to understand
what's going on when you use a 3rd party app to do the grunt work.
 