Help me with sockets/TCP

Jeff Johnson

I'm going to be giving a presentation to my co-workers about the low-level
workings of the HTTP protocol. To demonstrate what's going on, I wrote a
simple app that uses a TcpClient to send a GET request to a server and I'm
displaying the raw response. (This is why I couldn't use an HttpWebRequest;
it doesn't appear to provide any means of getting at the entire raw
response.)

I'm having difficulty getting the entire response from the Web server, and
it seems to be a matter of timing. Let's say for example that my
ReceiveBufferSize is 8192 and the response is 12000 bytes long, with about
500 of those bytes simply being the headers. What I output to the UI's text
box in normal run mode is only about 7600 bytes of content; I lose the rest.
However, if I set a breakpoint in the code that reads the NetworkStream and
step through, I end up getting everything.

I tried to duplicate the delay that I was introducing when stepping through
code by adding a spin loop, but that didn't work. Here's the code that sends
requests. It runs in a separate thread from the UI and I tried to annotate
variables/methods that live outside this code. Can anyone spot what I'm
doing wrong?

public void SendRequest(object url)
{
// This is a field used to allow the UI to stop the request
_cancel = false;

Uri u;

try
{
u = new Uri(url.ToString());
}
catch
{
throw;
}

TcpClient client = new TcpClient();

client.Connect(u.Host, u.Port);

NetworkStream ns = null;

try
{
ns = client.GetStream();

// _request is the backing store for a property
_request = string.Format(
"GET {1} HTTP/1.1{0}Accept: text/*{0}Accept-Language: en-us{0}" +
"Accept-Encoding: {0}User-Agent: Home-grown{0}Host: {2}:{3}{0}" +
"Connection: close{0}{0}",
Environment.NewLine, u.PathAndQuery, u.Host, u.Port);

UTF8Encoding encoder = new UTF8Encoding();

byte[] requestBytes = encoder.GetBytes(_request);

ns.Write(requestBytes, 0, requestBytes.Length);

while (!ns.DataAvailable)
{
if (_cancel)
{
return;
}

System.Threading.Thread.Sleep(0);
}

StringBuilder responseBuilder = new StringBuilder();

int bytesRead = 0;

bool continueWaiting = true;

do
{
if (_cancel)
{
return;
}

byte[] respBytes = new byte[client.ReceiveBufferSize];

bytesRead = ns.Read(respBytes, 0, (int)client.ReceiveBufferSize);

if (bytesRead > 0)
{
responseBuilder.Append(encoder.GetString(respBytes));
}

// Spin and wait for data
int spinCount = 0;

while (!ns.DataAvailable)
{
System.Threading.Thread.Sleep(10);

spinCount++;

if (spinCount >= 500)
{
continueWaiting = false;
break;
}
}
} while (continueWaiting);

// _response is the backing store for a property
_response = responseBuilder.ToString();

// If I put a breakpoint up by the do statement and step through, when
// _response gets filled it has the entire contents of the response (I can
// tell because I see the </html> tag).
// If I let the code run without breaking, the string in _response is
// obviously incomplete.

// This will notify the UI that a response was received
OnResponseReceived();
}
finally
{
ns.Close();
}

client.Close();
}
 
Chris Taylor

Hi,

First let me say I would not recommend going this route, but the basic
principle you should probably try is to parse the headers to get the
Content-Length and then read until you have received the specified amount of
data. If there is no Content-Length, then you should read until the
connection is closed, which indicates the server has sent all the data.
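
A minimal sketch of that approach, not a definitive implementation: it reads until the headers are complete, looks for Content-Length, and then reads either that many body bytes or until Read returns 0 (the connection was closed). The RawHttpReader class and its helper names are made up for illustration. It takes any Stream so it can be tried without a network, and it deliberately ignores "Transfer-Encoding: chunked".

```csharp
using System;
using System.IO;
using System.Text;

static class RawHttpReader
{
    // Reads one HTTP response from a stream (for example a NetworkStream)
    // and returns the raw bytes: status line, headers, blank line, body.
    public static byte[] ReadResponse(Stream stream)
    {
        MemoryStream all = new MemoryStream();
        byte[] buffer = new byte[8192];
        int headerEnd = -1;       // index just past the "\r\n\r\n" that ends the headers
        long contentLength = -1;  // -1 means no Content-Length header was found

        while (true)
        {
            // Stop as soon as the headers plus Content-Length body bytes have arrived.
            if (headerEnd >= 0 && contentLength >= 0 &&
                all.Length >= headerEnd + contentLength)
            {
                break;
            }

            int n = stream.Read(buffer, 0, buffer.Length);
            if (n == 0)
            {
                break; // Read returning 0 means the server closed the connection
            }
            all.Write(buffer, 0, n); // keep only the n bytes actually read

            if (headerEnd < 0)
            {
                headerEnd = FindHeaderEnd(all.GetBuffer(), (int)all.Length);
                if (headerEnd >= 0)
                {
                    string headers = Encoding.ASCII.GetString(all.GetBuffer(), 0, headerEnd);
                    contentLength = ParseContentLength(headers);
                }
            }
        }
        return all.ToArray();
    }

    // Returns the index just past the first CR LF CR LF, or -1 if not present yet.
    static int FindHeaderEnd(byte[] data, int length)
    {
        for (int i = 0; i + 3 < length; i++)
        {
            if (data[i] == 13 && data[i + 1] == 10 &&
                data[i + 2] == 13 && data[i + 3] == 10)
            {
                return i + 4;
            }
        }
        return -1;
    }

    // Returns the Content-Length value, or -1 if the header is missing or invalid.
    static long ParseContentLength(string headers)
    {
        foreach (string line in headers.Split(new string[] { "\r\n" }, StringSplitOptions.None))
        {
            int colon = line.IndexOf(':');
            if (colon > 0 && line.Substring(0, colon).Trim()
                    .Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
            {
                long value;
                if (long.TryParse(line.Substring(colon + 1).Trim(), out value))
                {
                    return value;
                }
            }
        }
        return -1;
    }
}
```

With the code from the original post, this could be called as RawHttpReader.ReadResponse(ns) right after the request has been written. Because Read blocks until data arrives or the connection closes, no DataAvailable polling or sleep loop is needed.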

To be honest, I would suggest that you use a tool like Fiddler. It is easy
to use, its output is human-readable, and you can demonstrate many more
scenarios; for example, you might want to show the basic authentication
handshake.

http://www.fiddler2.com/fiddler2/

Hope this helps

--
Chris Taylor
http://taylorza.blogspot.com
http://dotnetjunkies.com/weblog/chris.taylor



 
Jeff Johnson

That said, some thoughts on the code you posted:

Thanks for the suggestions. I used some (mainly the decoder) and fiddled
around with some other things. Here's the revised function which finally
worked:

public void SendRequest(object url)
{
Uri u;

try
{
u = new Uri(url.ToString());
}
catch
{
throw;
}

TcpClient client = new TcpClient();
client.ReceiveTimeout = 5000;

client.Connect(u.Host, u.Port);

NetworkStream ns = null;

try
{
ns = client.GetStream();
ns.ReadTimeout = 10000;

_request = string.Format(
"GET {1} HTTP/1.1{0}Accept: text/*{0}Accept-Language: en-us{0}" +
"Accept-Encoding: {0}User-Agent: Home-grown{0}Host: {2}:{3}{0}" +
"Connection: close{0}{0}",
Environment.NewLine, u.PathAndQuery, u.Host, u.Port);

UTF8Encoding enc = new UTF8Encoding();

byte[] requestBytes = enc.GetBytes(_request);

ns.Write(requestBytes, 0, requestBytes.Length);

StringBuilder responseBuilder = new StringBuilder();

Decoder dc = enc.GetDecoder();

int bytesRead = 0;

do
{
byte[] respBytes = new byte[client.ReceiveBufferSize];
char[] decoded = new char[client.ReceiveBufferSize];

try
{
bytesRead = ns.Read(respBytes, 0,
(int)client.ReceiveBufferSize);

if (bytesRead > 0)
{
int bytesUsed;
int charsUsed;
bool completed;

dc.Convert(respBytes, 0, bytesRead, decoded, 0,
client.ReceiveBufferSize, false,
out bytesUsed, out charsUsed, out completed);

responseBuilder.Append(decoded, 0, charsUsed);
}
else
{
break;
}
}
catch
{
break;
}
} while (bytesRead > 0);

_response = responseBuilder.ToString();

OnResponseReceived();
}
finally
{
ns.Close();
}

client.Close();
}

Of course, I don't by any means claim that this is a robust solution*, but it
got me all the data and taught me more about data transfer than I previously
knew (which was very little).
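
As a rough illustration of why the decoder helps (a sketch, with the DecoderDemo class and its method names invented for this example): a single read can end in the middle of a multi-byte UTF-8 character. A Decoder obtained from GetDecoder() remembers the incomplete bytes until the next call, whereas calling Encoding.GetString on each buffer decodes it in isolation. Both methods below also decode only the bytes in each chunk, which is the same idea as passing bytesRead instead of the whole buffer.

```csharp
using System;
using System.Text;

static class DecoderDemo
{
    // Decodes all chunks with one shared Decoder, which keeps any
    // incomplete multi-byte sequence until the next chunk arrives.
    public static string DecodeChunks(byte[][] chunks)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        StringBuilder sb = new StringBuilder();

        foreach (byte[] chunk in chunks)
        {
            // GetCharCount takes the decoder's pending bytes into account.
            char[] chars = new char[decoder.GetCharCount(chunk, 0, chunk.Length)];
            int charCount = decoder.GetChars(chunk, 0, chunk.Length, chars, 0);
            sb.Append(chars, 0, charCount);
        }
        return sb.ToString();
    }

    // Decodes every chunk on its own, so a character split across two
    // chunks cannot be reassembled.
    public static string DecodeIndependently(byte[][] chunks)
    {
        StringBuilder sb = new StringBuilder();
        foreach (byte[] chunk in chunks)
        {
            sb.Append(Encoding.UTF8.GetString(chunk, 0, chunk.Length));
        }
        return sb.ToString();
    }
}
```

For example, "café" in UTF-8 is the bytes 63 61 66 C3 A9. If one read delivers 63 61 66 C3 and the next delivers A9, DecodeChunks reassembles the "é", while DecodeIndependently does not.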

And Fiddler is an awesome program. Thanks to Chris for the link.


*It doesn't even handle everything right in the content payload, especially
when servers use "Transfer-Encoding: chunked", because it doesn't realize
that a few bytes are a length indicator and not actual text.
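
For what it's worth, handling that case could look roughly like the sketch below. It assumes the whole body (everything after the blank line that ends the headers) is already in a byte array, and the ChunkedBody class name is made up. Each chunk is a hexadecimal size line, a CRLF, that many bytes of data, and another CRLF; a size of 0 marks the end. Chunk extensions after a ';' are skipped and trailers are ignored.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

static class ChunkedBody
{
    // Decodes a complete "Transfer-Encoding: chunked" body into the plain payload.
    // Throws InvalidDataException if the input is truncated, and FormatException
    // if a size line is not valid hexadecimal.
    public static byte[] Decode(byte[] body)
    {
        MemoryStream output = new MemoryStream();
        int pos = 0;

        while (true)
        {
            int lineEnd = IndexOfCrLf(body, pos);
            if (lineEnd < 0)
            {
                throw new InvalidDataException("Missing chunk size line.");
            }

            // The size line is hex, optionally followed by ";extension".
            string sizeLine = Encoding.ASCII.GetString(body, pos, lineEnd - pos);
            int semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0)
            {
                sizeLine = sizeLine.Substring(0, semicolon);
            }

            int size = int.Parse(sizeLine.Trim(), NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);
            if (size == 0)
            {
                break; // last chunk
            }

            pos = lineEnd + 2; // first byte of the chunk data
            if (pos + size > body.Length)
            {
                throw new InvalidDataException("Chunk data is truncated.");
            }

            output.Write(body, pos, size);
            pos += size + 2; // skip the data and the CRLF that follows it
        }
        return output.ToArray();
    }

    // Returns the index of the next CR LF at or after start, or -1 if none.
    static int IndexOfCrLf(byte[] data, int start)
    {
        for (int i = start; i + 1 < data.Length; i++)
        {
            if (data[i] == 13 && data[i + 1] == 10)
            {
                return i;
            }
        }
        return -1;
    }
}
```

The body has to be decoded as bytes before converting to text, since the chunk sizes count bytes, not characters.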
 
