screen scraping

RobcPettit · May 11, 2006

Hi, I'm using
// Open the requested URL
WebRequest req =
WebRequest.Create("http://www.betfairgames.com/?rfr=1738&sid=77&pi.localeId=en_GB&pi.regionId=GBR");

// Get the stream from the returned web response
StreamReader stream = new
StreamReader(req.GetResponse().GetResponseStream());

// Collect the page text in a StringBuilder
System.Text.StringBuilder sb = new
System.Text.StringBuilder();
string strLine;

// Read the stream a line at a time and place each one
// into the stringbuilder
while ((strLine = stream.ReadLine()) != null)
{
    // Ignore blank lines
    if (strLine.Length > 0)
        sb.Append(strLine);
}

// Finished with the stream so close it now
stream.Close();

// Cache the streamed site now so it can be used
// without reconnecting later

}
}
to get the HTML from Betfair. The problem I've got, and I've spent hours
googling, is that I can't work out what to do with it. Sounds stupid, I
know. Two problems really: the info I want is the results, which I think
are not in HTML but in text, and I can't work out how to grab the text.
I think the site is XHTML. Please can anyone suggest some clear info? I
realise from googling that this topic is vast.
Regards, Robert

Nicholas Paldino [.NET/C# MVP] · May 11, 2006

Looking at the page, it appears to be HTML, not XHTML, and not text.

What you need to do is parse this, and then access the elements of the
Document Object Model in order to determine the values that you want.

You can use MSHTML for this (and probably should, if you are not going
to display the responses) through COM interop.

Hope this helps.
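
For a first look at what text the downloaded page actually carries, a rough sketch along these lines may help. It does not use MSHTML. It swaps in regular expressions from System.Text.RegularExpressions to strip the markup, which is fragile on real-world HTML and is no substitute for the DOM approach suggested above. The names HtmlTextSketch and ExtractText are hypothetical, and the input is assumed to be the string built up from the response (for example sb.ToString()).

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlTextSketch
{
    // Hypothetical helper: reduces an HTML string to its visible text.
    // Assumption: the markup is simple enough for regular expressions;
    // a DOM parser is the more robust choice for real pages.
    public static string ExtractText(string html)
    {
        // Drop script and style blocks, whose contents are never displayed
        string noScripts = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space
        string noTags = Regex.Replace(noScripts, @"<[^>]+>", " ");

        // Decode entities such as &amp; and collapse runs of whitespace
        string decoded = WebUtility.HtmlDecode(noTags);
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

Calling HtmlTextSketch.ExtractText on the downloaded string gives one long line of text that can be searched for the values of interest. If the results are filled in by script after the page loads, they will not be in the downloaded HTML at all, and neither this sketch nor a plain DOM parse of the response will find them.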

RobcPettit · May 11, 2006

Thank you for your reply, I'll read up on these.
Regards, Robert

alex_f_il · May 13, 2006

You can also try SWExplorerAutomation (http://webunittesting.com).
