async sockets and threading

Alexander Gnauck

Hello,

while using async sockets I ran into a strange problem which occurs only
on some machines. I wrote a small demo application which can be used to
reproduce the problem. You can download it from here:
http://alex.ag-software.de/SocketTest.zip

If you press the "Connect in new Thread" button in the test
application, you may be able to trigger a System.IO.IOException: Unable
to write data to the transport connection. The other button, which
executes the same code without a new thread, works fine.

The async BeginWrite is executed in the EndConnect callback of the
socket, so the socket must be connected, yet the exception tells me it
isn't.
I've debugged and studied my code for hours but can't find the problem.
Is there anything I am doing wrong?

It would be great if you could run the code and tell me whether you can
trigger the exception, and which OS and Framework version you are running.

Thanks,
Alex
 
Alexander Gnauck

Peter said:
Please post a concise-but-complete code sample here.

ok, here is the example:

using System;
using System.IO;
using System.Text;

using System.Threading;
using System.Net;
using System.Net.Sockets;

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            Thread connectThread = new Thread(new ParameterizedThreadStart(ConnectThread));
            connectThread.Start();

            Console.ReadLine();
        }

        static void ConnectThread(object o)
        {
            Protocol p = new Protocol();
            p.Open();
        }

        public class Protocol
        {
            private static Socket _socket;
            private static NetworkStream _stream;
            private const int BUFFERSIZE = 10240;
            private static byte[] m_ReadBuffer = new byte[BUFFERSIZE];

            public Protocol()
            {
            }

            public void Open()
            {
                IPHostEntry ipHostInfo = Dns.GetHostEntry("www.google.com");

                IPAddress ipAddress = ipHostInfo.AddressList[0];
                IPEndPoint endPoint = new IPEndPoint(ipAddress, 80);

                _socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
                _socket.BeginConnect(endPoint, new AsyncCallback(EndConnect), null);
            }

            private static void EndConnect(IAsyncResult ar)
            {
                try
                {
                    _socket.EndConnect(ar);
                    _stream = new NetworkStream(_socket, false);

                    string header = "GET / HTTP/1.1\r\n";
                    header += "Accept: */*\r\n";
                    header += "User-Agent: myAgent\r\n";
                    header += "Host: www.google.de\r\n\r\n";

                    Send(Encoding.UTF8.GetBytes(header));
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }

            public static void Send(byte[] bData)
            {
                try
                {
                    _stream.BeginWrite(bData, 0, bData.Length, new AsyncCallback(EndSend), null); // <<== exception gets raised here
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.Message);
                }
            }

            private static void EndSend(IAsyncResult ar)
            {
                _stream.EndWrite(ar);
            }
        }
    }
}

Peter said:
I should point out that just because your connect callback has been
called, that doesn't necessarily mean your socket has been connected.
It just means that the operation has been completed. You need to call
EndConnect() and only proceed with writing to the socket if no exception
occurs and you do in fact get back a valid socket.

I get no exception in EndConnect. In EndConnect I also create the
network stream because I have to work with streams to start the TLS
security layer on the connection later.

Alex
 
Ben Voigt [C++ MVP]

Peter said:
Peter said:
Please post a concise-but-complete code sample here.

ok, here is the example: [...]

Thanks for the example. With that in hand, it looks to me as though
the problem is related to garbage collection. In particular, it
appears that the JIT compiler figures out that you don't refer to the
_socket or _stream class members once the thread you started exits,
and it makes those members eligible for garbage collection. And of
course, when that happens (well, once the finalizer is run), the
connection winds up closed.

I added some synchronization to ensure that the Open() method doesn't
return until the EndSend() callback has been executed, and that
reliably makes the problem go away. That seems to confirm the GC as
the culprit.
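
For readers following along, the kind of synchronization described above could look roughly like the sketch below. It is not the code that was actually used in the test: the ManualResetEvent field _sendDone, the 10 second timeout, and the loopback TcpListener (used here so the sketch needs no external host) are all assumptions made for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

class SyncDemo
{
    static Socket _socket;
    static NetworkStream _stream;
    // Hypothetical addition: signalled once the write callback has run.
    static readonly ManualResetEvent _sendDone = new ManualResetEvent(false);

    static void Main()
    {
        // Assumption: a local listener stands in for the remote web server.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        IPEndPoint endPoint = (IPEndPoint)listener.LocalEndpoint;

        Thread t = new Thread(delegate() { Open(endPoint); });
        t.Start();
        t.Join();

        Console.WriteLine("send callback ran: " + _sendDone.WaitOne(0));
        listener.Stop();
    }

    static void Open(IPEndPoint endPoint)
    {
        _socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        _socket.BeginConnect(endPoint, new AsyncCallback(EndConnect), null);

        // Keep the calling thread alive until EndSend has run (or 10 s pass).
        _sendDone.WaitOne(10000);
    }

    static void EndConnect(IAsyncResult ar)
    {
        try
        {
            _socket.EndConnect(ar);
            _stream = new NetworkStream(_socket, false);
            byte[] data = Encoding.UTF8.GetBytes("GET / HTTP/1.1\r\n\r\n");
            _stream.BeginWrite(data, 0, data.Length, new AsyncCallback(EndSend), null);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }

    static void EndSend(IAsyncResult ar)
    {
        try { _stream.EndWrite(ar); }
        catch (Exception ex) { Console.WriteLine(ex.Message); }
        finally { _sendDone.Set(); } // release the thread blocked in Open()
    }
}
```

The only change relative to the posted sample is that Open() blocks until the write callback signals the event, so the thread that issued BeginConnect does not exit while the I/O is still pending.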

What if you instead saved the IAsyncResult returned by BeginConnect into a
static variable? Perhaps that is what is being collected (though presumably
the same one gets passed to the completion routine)?

Peter said:
Now, whether that's _correct_ behavior, I'm not entirely sure. It
_seems_ like a .NET bug to me. In particular, those are static
members, and I would have thought all class static members would be
considered a root for the purpose of garbage collection. In fact,
one of the authoritative articles on .NET garbage collection seems to
say just that: " all the global and static object pointers in an
application are considered part of the application's roots" (from
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx).

So, at the very least, for this code example, it seems to me that the
GC is prematurely collecting the objects. If your real-world
scenario also keeps the objects in static members, then the same bug
would be affecting you.

I think the BCL should be keeping these objects alive completely independent
of the user's references from static members. After all, the only living
reference could be in the State parameter to BeginConnect.
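
To illustrate the case I mean, here is a minimal sketch in which the socket is referenced only through the state argument of BeginConnect and recovered from IAsyncResult.AsyncState in the callback. The class name StateDemo, the loopback TcpListener and the timeout are assumptions for the sake of a self-contained example; this is not taken from the original test application.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading;

class StateDemo
{
    // Lets Main wait until the callback has finished.
    static readonly ManualResetEvent Done = new ManualResetEvent(false);

    static void Main()
    {
        // Assumption: a local listener stands in for the remote host.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();

        Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        // The socket travels as the state object; no field refers to it.
        socket.BeginConnect(listener.LocalEndpoint, new AsyncCallback(OnConnect), socket);
        socket = null; // drop the local reference on purpose

        Done.WaitOne(10000);
        listener.Stop();
    }

    static void OnConnect(IAsyncResult ar)
    {
        // Recover the socket from the state object passed to BeginConnect.
        Socket s = (Socket)ar.AsyncState;
        try
        {
            s.EndConnect(ar);
            Console.WriteLine("connected: " + s.Connected);
        }
        catch (SocketException ex)
        {
            Console.WriteLine(ex.Message);
        }
        finally
        {
            s.Close();
            Done.Set();
        }
    }
}
```

If the framework did not keep the pending operation and its state object alive on its own, code written this way could never work reliably.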

Peter said:
Of course, if in your real-world scenario, these objects wind up
referenced by instance members that are in a class instance that
itself becomes unreachable due to the use of the thread, then _that_
would be expected behavior. The fix there would be, of course, to
keep a reference to the instance storing the socket and stream
instances, so that they remain reachable and uncollected.

In that case, presumably the completion routine would be an instance method,
so the AsyncCallback delegate passed to BeginConnect would reference the
instance and (should) keep it alive.
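
A small sketch of what I mean: a delegate created from an instance method stores the instance in its Target property, while a delegate created from a static method has a null Target. The class and method names below are made up for the illustration.

```csharp
using System;

class Protocol
{
    public void OnConnect(IAsyncResult ar) { }
    public static void OnConnectStatic(IAsyncResult ar) { }
}

class TargetDemo
{
    static void Main()
    {
        Protocol p = new Protocol();

        // Delegate bound to an instance method: it references p.
        AsyncCallback instanceCb = new AsyncCallback(p.OnConnect);
        // Delegate bound to a static method: it references no instance.
        AsyncCallback staticCb = new AsyncCallback(Protocol.OnConnectStatic);

        Console.WriteLine(ReferenceEquals(instanceCb.Target, p)); // True
        Console.WriteLine(staticCb.Target == null);               // True
    }
}
```

Note that the callbacks in the posted sample are static, so the delegates there hold no reference to the Protocol instance.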
 
Alexander Gnauck

Peter said:
Note that I don't even know for sure that this is a GC issue. That's
just my supposition, based on the behavior. It _looks_ like a GC
issue. But frankly, the lifetime of a Thread shouldn't really be
affecting the lifetime of objects not specifically tied to that Thread
anyway. So if the static variables are being collected, not only is
that a bug, but the fact that it's being done so just because a Thread
exited seems also to be a bug to me.

The problem is not related to the static members. When I wrote this test
case I first had a flat console application with all static variables,
functions and callbacks. Then I moved things into the Protocol class and
forgot to change the static modifiers. In my real application they are
not static and the behavior is the same.

Peter said:
It's very odd.

Yes, it is.

Alex
 
Alexander Gnauck

Peter said:
[...] In my real application they are not static and the behavior is
the same.

Including the workaround?

No, the product is an SDK/library. It would be too complex to add the
workaround to the code, because the new thread is created in the end
user's code and not in my code. Normally threads are not required because
the library is based on async sockets, and in most cases only one
instance of the class is needed in an application.

I filed a bug report at Microsoft Connect, but they are unable to
reproduce the problem with the provided code. So it works on many
machines and fails only on some. The next question is why the GC would
behave differently on different machines or operating systems. It could
also be related to the CPU. I am running XP Pro on 2 machines and I get
the exception on both machines with .NET 2.0 and 3.5.

Alex
 
Alexander Gnauck

Peter said:
If you figure that out, please follow up here.

I will.

Peter said:
Either post a new bug
report that's public, or ask Microsoft to change the status so that the
bug report you already submitted is public.

I asked them to make it public. I will keep you informed.

Alex
 
Alexander Gnauck

Peter said:
If you figure that out, please follow up here. Either post a new bug
report that's public, or ask Microsoft to change the status so that the
bug report you already submitted is public.

Microsoft is escalating this issue to the appropriate group within the
Visual Studio Product Team for triage and resolution.
But no other response yet.

Alex
 