"Peter Duniho" <
[email protected]> wrote in message news:[email protected]...
[...]
A long time ago, I remember that if I took out the sleeps, the server
would be very busy and slow. Back then, I only had 16 clients connected
to the server. Now I have 120.
The current code looks like this; the old code for 16 clients should be
similar:
while (true) {
    if (!stream.DataAvailable) { Thread.Sleep(200); continue; }
    int got = stream.Read(buffer, offset, size);
    if (got == 0) { Thread.Sleep(200); continue; }
    // deal with data;
    // then send response back using ThreadPool;
}
Right. This is polling, and this is awful.
Calling Thread.Sleep() helps keep this particular thread from completely
consuming all available CPU time, and from starving other threads. But
the polling actually increases your context switching, because if you
used a different i/o mechanism, that thread would not have to run at all
until there was actually something to do. As it is, it has to wake up
every 200 ms just to see if there's data to be read, which causes a
context switch even when there's no data to be read.
Of course, the other bad thing that sleeping does is that it forces that
thread to wait as long as 200 ms before it can do any work, increasing
latency.
The one good thing I can say about that code is that at least you only
sleep when you believe you have nothing to do (though, a 0-byte read means
that the stream has closed, which you don't seem to deal with in the code
sample above). So at least the thread keeps the CPU as long as it can,
when it has work to do.
But otherwise, it's a terrible way to do i/o.
In the async code, I don't use while(true), just
BeginReadCallBack()
{
    EndRead();
    // deal with the read msg
    BeginRead(,,, new AsyncCallback(BeginReadCallBack)); // calls itself
}
and there is no need to sleep anywhere. Is this the right way? I heard
async coding is complex; mine is so simple that it makes me doubt whether
I wrote it the wrong way.
I know. But it really is just that simple. It's one of the reasons I
like the async API so much. Asynchronous coding can be complex in the
sense that you have to mentally accommodate the possibility of
multi-threaded access to data structures. But since you're using
threading already, you already have this complexity in your code, but
without the inherent advantage of simplicity that the async API otherwise
provides.
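For reference, here is a minimal sketch of that callback chain written out in full. It assumes the classic Stream.BeginRead()/EndRead() pair; the AsyncReader class, its member names, and the "collect the bytes as ASCII text" handling are placeholders of mine, not anything from your code, and a MemoryStream stands in for the NetworkStream so the sample runs on its own.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical per-connection reader; class and member names are illustrative.
class AsyncReader
{
    private readonly Stream stream;
    private readonly byte[] buffer = new byte[4096];

    public readonly StringBuilder Received = new StringBuilder();
    public readonly ManualResetEvent Closed = new ManualResetEvent(false);

    public AsyncReader(Stream stream) { this.stream = stream; }

    // Issue the first read. No loop and no Sleep() anywhere.
    public void Start()
    {
        stream.BeginRead(buffer, 0, buffer.Length, ReadCallback, null);
    }

    // Runs each time a read completes (typically on an i/o or pool thread).
    private void ReadCallback(IAsyncResult ar)
    {
        try
        {
            int got = stream.EndRead(ar);
            if (got > 0)
            {
                // "Deal with the data": here it is only collected as text.
                Received.Append(Encoding.ASCII.GetString(buffer, 0, got));
                // Re-issue the read; this same method runs on completion.
                stream.BeginRead(buffer, 0, buffer.Length, ReadCallback, null);
                return;
            }
        }
        catch (IOException) { }             // connection error: treat as closed
        catch (ObjectDisposedException) { } // stream was closed locally

        Closed.Set(); // 0-byte read or error: the connection is finished
    }
}

static class Program
{
    static void Main()
    {
        // A MemoryStream stands in for the client's NetworkStream.
        var reader = new AsyncReader(
            new MemoryStream(Encoding.ASCII.GetBytes("hello")));
        reader.Start();
        reader.Closed.WaitOne();
        Console.WriteLine(reader.Received);
    }
}
```

Only one read is outstanding per connection at any time, so the buffer and the Received builder are never touched by two callbacks at once. Anything shared between connections would still need its own locking.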
Pete, from your response, I understand that async I/O is best and that
dedicated threads (120 threads) are better. But I am still not confident
about taking the sleep out of each dedicated thread. I am afraid it will
keep the server busy.
No. The async API is essentially a blocking API. That's one of the
things that makes it so useful. It has the same advantage that the
regular blocking API has (that is, you aren't consuming any CPU resources
unless there's actual i/o work to do) plus the same advantage that a
multi-threaded implementation has: you can easily handle a relatively
arbitrary number of clients with code that is essentially the same as if
you were dealing with a single client (that is, the bulk of the code looks
the same as if you only had to deal with one client instead of many).
With the async API, there is no "dedicated thread". A thread is assigned
as needed to each i/o operation that completes. But it's done in a much
more efficient way than your current way of dealing with writes. You are
queuing each write operation to the thread pool, which adds a lot of
overhead. But the async API has threads that are specifically for dealing
with i/o operations (in this sense they are "dedicated", but you're not
the one dedicating them), and they just pull completed i/o operations
out of a queue as long as they exist.
If the queue is empty, there's nothing to do and Windows doesn't run any
of the threads. They simply block, consuming no CPU resources. If a
given thread already has the CPU, and there are i/o operation completions
in the queue, Windows will let that thread continue to run for as long as
is practical, rather than switching to a different thread.
Now, even if you dedicate a single thread to each client, you can avoid
the problems of the polling implementation you posted, by using the
blocking API without checking for data availability. Just call
Stream.Read(). It won't return until there's data available to read, and
the thread will not consume any CPU resources. Windows knows that the
thread can't do anything until the call to Read() completes, and won't
schedule that thread until then.
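As a rough illustration of that blocking loop, here is a minimal sketch. The BlockingReader class, the ReadUntilClosed() name, and the handle callback are hypothetical stand-ins for whatever "deal with data" means in your server; the only real API used is Stream.Read().

```csharp
using System;
using System.IO;

static class BlockingReader
{
    // Reads until the other side closes the stream; returns the byte total.
    // "handle" is a placeholder for the application's own processing.
    public static long ReadUntilClosed(Stream stream, Action<byte[], int> handle)
    {
        var buffer = new byte[4096];
        long total = 0;

        while (true)
        {
            // Blocks until data arrives or the stream is closed.
            // No DataAvailable check and no Thread.Sleep().
            int got = stream.Read(buffer, 0, buffer.Length);

            if (got == 0)
            {
                break; // a 0-byte read means the stream has closed
            }

            handle(buffer, got);
            total += got;
        }

        return total;
    }
}
```

Note that the 0-byte case ends the loop instead of sleeping and retrying, which is the missing piece in the polling sample.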
But using the async API you will avoid one problem that even the correct
"dedicated single thread/client" implementation would have: context
switches related to having multiple clients to service. With one thread
dedicated to each client, if you have i/o operations for multiple clients
that have completed, you will still be forced to have a context switch to
deal with each client, because each thread knows about only one client.
But using the async API, Windows is able to keep using the same thread to
handle i/o completions on multiple connections, avoiding any context
switches related to dealing with multiple clients.
I thought that if each thread sleeps longer in each pass of its loop,
there will be fewer context switches. While one thread sleeps, it gives
the other threads more time to run, with no need to switch back to the
first thread. -- Is this a totally wrong idea?
Yes. It's not clear from your post whether that loop exists in a single
thread that manages multiple connections, or is being executed in multiple
threads, one thread per connection. But either way, the big problem with
the loop is that you are explicitly checking the DataAvailable property,
rather than just calling Read().
If you would just call Read(), then the thread would remain blocked until
there's data to be returned, and would not use any CPU time at all. But
as it is now, you only call Read() when you expect it to return right
away, which means that Windows has to keep scheduling that thread for
execution. Each thread is pretty much guaranteed to have to run 5 times a
second, even for a thread that's dealing with a connection that doesn't
have any i/o happening.
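To put a number on that, here is the arithmetic as a small sketch, assuming one polling thread per connection, the 200 ms sleep from the posted loop, and the 120 clients mentioned earlier. The PollingCost class and method name are made up for the illustration.

```csharp
using System;

static class PollingCost
{
    // Minimum wake-ups per second caused by polling alone,
    // assuming one polling thread per connection.
    public static double WakeupsPerSecond(int threads, int sleepMs)
    {
        return threads * (1000.0 / sleepMs);
    }

    static void Main()
    {
        // 120 threads, each waking every 200 ms
        Console.WriteLine(WakeupsPerSecond(120, 200));
    }
}
```

That comes to 600 wake-ups every second even when every connection is idle, against zero for threads that are simply blocked in Read().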
In this scenario, calling Thread.Sleep() is certainly better than not
calling Thread.Sleep(). But you shouldn't be in "this scenario" in the
first place. Polling -- that is, checking the DataAvailable property --
isn't useful, and it causes the performance issues you're seeing.
Personally, I like the async API and that's how I'd do this. But,
assuming that you already have a dedicated thread for each connection, you
may be able to fix things just by taking out the check for DataAvailable,
taking out the calls to Thread.Sleep(), and taking out the queuing of
write operations to the ThreadPool (just do the writes from the same
thread that is handling reads). In other words, most of what's wrong with
your current implementation may well just be that you have too much code.
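Put together, the pared-down dedicated-thread version might look something like the sketch below. It is only an outline under assumptions: the DedicatedThreadServer class, the method names, and the echo reply are placeholders I made up, and the Main() method just exercises it over a loopback connection.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

static class DedicatedThreadServer
{
    // Runs on one thread per client: blocking reads, and the response is
    // written from this same thread instead of being queued to the pool.
    public static void ServeClient(TcpClient client)
    {
        using (client)
        using (NetworkStream stream = client.GetStream())
        {
            var buffer = new byte[4096];
            int got;
            try
            {
                // Read() blocks until data arrives; 0 means closed.
                while ((got = stream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // "Deal with data": the echo is only a placeholder reply.
                    stream.Write(buffer, 0, got);
                }
            }
            catch (IOException) { } // connection reset: treat as closed
        }
    }

    public static void AcceptLoop(TcpListener listener)
    {
        try
        {
            while (true)
            {
                TcpClient client = listener.AcceptTcpClient();
                new Thread(() => ServeClient(client)) { IsBackground = true }.Start();
            }
        }
        catch (SocketException) { }         // listener was stopped
        catch (ObjectDisposedException) { } // listener was disposed
    }

    static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // 0: any free port
        listener.Start();
        new Thread(() => AcceptLoop(listener)) { IsBackground = true }.Start();

        int port = ((IPEndPoint)listener.LocalEndpoint).Port;
        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);
            NetworkStream s = client.GetStream();
            byte[] msg = Encoding.ASCII.GetBytes("ping");
            s.Write(msg, 0, msg.Length);

            var reply = new byte[msg.Length];
            int n = 0, got;
            while (n < reply.Length && (got = s.Read(reply, n, reply.Length - n)) > 0)
            {
                n += got;
            }
            Console.WriteLine(Encoding.ASCII.GetString(reply, 0, n));
        }
    }
}
```

There is no DataAvailable check, no Thread.Sleep(), and no ThreadPool call in the per-client code, which is the whole point.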
Pete