read lock needed?


Ryan Liu

Hi,

Suppose many threads write to a variable (e.g. var++) and another thread
reads it on an interval basis.

For the writing threads, I know I need a lock, or the value could end up
lower than it should be (though I think that is mostly unlikely for a ++
operation, since it is not something like reading a value, waiting a while,
and then writing it back in a multithreaded environment. BTW, is that
understanding right?).

My question: for the reading thread, do I need a lock?
If a lock is still a must, what about when I can accept a race condition,
as long as the value I read is not "too old", e.g. 2 or more less than the
current value, and a new read is never less than an older read?

If I don't use a lock in the reading thread, do I have to declare the
variable volatile, or could the reader potentially not see the current value?

Thanks a lot!
 

Brian Gideon

Ryan said:
Hi,

Suppose many threads write to a variable (e.g. var++) and another thread
reads it on an interval basis.

For the writing threads, I know I need a lock, or the value could end up
lower than it should be (though I think that is mostly unlikely for a ++
operation, since it is not something like reading a value, waiting a while,
and then writing it back in a multithreaded environment. BTW, is that
understanding right?).

The ++ operator does not produce an atomic read-increment-write action,
so it certainly is possible that you'd see a lower (or possibly higher)
value for the variable than what it's supposed to be.
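
To make that concrete, here is a minimal sketch (class and member names are
mine, purely for illustration): a plain var++ from several threads can lose
increments, whereas a lock or Interlocked.Increment cannot, and reading under
the same lock always sees an up-to-date value.

using System.Threading;

class Counter
{
    private readonly object sync = new object();
    private int value;

    // Safe increment: the lock makes the read-increment-write indivisible.
    public void IncrementWithLock()
    {
        lock (sync)
        {
            value++;
        }
    }

    // Equivalent atomic increment without a lock.
    public void IncrementWithInterlocked()
    {
        Interlocked.Increment(ref value);
    }

    // Reading under the same lock guarantees a fresh value.
    public int Read()
    {
        lock (sync)
        {
            return value;
        }
    }
}
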
My question: for the reading thread, do I need a lock?

Well, technically no. At the very least you do need to use one of the
volatile read mechanisms, but it's best if you go ahead and use a lock,
especially if you're already using one for the writing threads.
If a lock is still a must, what about when I can accept a race condition,
as long as the value I read is not "too old", e.g. 2 or more less than the
current value, and a new read is never less than an older read?

Seeing a value that's too old, or a read that goes backwards, is impossible
if you use locks correctly.
If I don't use a lock in the reading thread, do I have to declare the
variable volatile, or could the reader potentially not see the current value?

That's correct. But again, I'd use a lock.
 

Jon Skeet [C# MVP]

Ryan Liu said:
Suppose many threads write to a variable (e.g. var++) and another thread
reads it on an interval basis.

For the writing threads, I know I need a lock, or the value could end up
lower than it should be (though I think that is mostly unlikely for a ++
operation, since it is not something like reading a value, waiting a while,
and then writing it back in a multithreaded environment. BTW, is that
understanding right?).

My question: for the reading thread, do I need a lock?

Yes, if you want to make sure you see a recent value.
If a lock is still a must, what about when I can accept a race condition,
as long as the value I read is not "too old", e.g. 2 or more less than the
current value, and a new read is never less than an older read?

Within one thread, I don't *think* you should ever see one value
followed by a smaller one. Within multiple threads you could.
If I don't use a lock in the reading thread, do I have to declare the
variable volatile, or could the reader potentially not see the current value?

Using volatile would mean you wouldn't need a lock in the reading thread.

An alternative is to use Thread.MemoryBarrier().

However, I'd suggest just using the lock for all access to the
variable: do you have any evidence to suggest this would be a bad idea?
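
For what it's worth, a rough sketch of the lock-free reader being described
(field and method names are mine): the writers still need an atomic
increment, and the reader uses Thread.VolatileRead or an explicit
Thread.MemoryBarrier() instead of a lock. Declaring the field volatile would
have much the same effect as the VolatileRead call.

using System.Threading;

class SharedCounter
{
    private int value;

    // Writers still need an atomic increment (a lock works just as well).
    public void Increment()
    {
        Interlocked.Increment(ref value);
    }

    // Lock-free read: VolatileRead adds the memory barrier needed to avoid
    // reading a stale, cached value.
    public int Read()
    {
        return Thread.VolatileRead(ref value);
    }

    // Alternative: an explicit full fence before an ordinary read.
    public int ReadWithBarrier()
    {
        Thread.MemoryBarrier();
        return value;
    }
}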
 

Ryan Liu

Jon Skeet said:
Yes, if you want to make sure you see a recent value.


Within one thread, I don't *think* you should ever see one value
followed by a smaller one. Within multiple threads you could.


Using volatile would mean you wouldn't need a lock in the reading thread.

An alternative is to use Thread.MemoryBarrier().

However, I'd suggest just using the lock for all access to the
variable: do you have any evidence to suggest this would be a bad idea?


Actually, no. I just thought a lock might be more costly than volatile. This is a
client/server application, and I expect it one day to support 500
clients (threads) at the same time.

Thanks, Jon!

Ryan

Jon Skeet [C# MVP]

Ryan Liu said:
Actually, no. I just thought a lock might be more costly than volatile.

Yes, it is.
This is a client/server application, and I expect it one day to support 500
clients (threads) at the same time.

I would anticipate that the cost of locking will still be insignificant
compared with other costs. If the code runs sluggishly with a high
number of threads, profile it and see whether the locks are the
problem. Basically, stick to a simple solution until there's a reason
to move away from it.

(You may well want to consider moving to a model with fewer threads and
asynchronous operations at that point as an alternative strategy, btw.)
 

Brian Gideon

Ryan said:
Actually, no. I just thought a lock might be more costly than volatile.

I suppose it is in most circumstances, but have you benchmarked a lock?
I think you might be surprised by how fast it is. Also, it may seem
counterintuitive, but some lock-free strategies can actually be
slower.
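
For a feel for the numbers, a crude micro-benchmark along these lines is easy
to put together (the iteration count is arbitrary, and this only measures the
uncontended case):

using System;
using System.Diagnostics;
using System.Threading;

class LockCost
{
    static readonly object sync = new object();
    static int counter;

    static void Main()
    {
        const int iterations = 10000000;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            lock (sync) { counter++; }
        }
        sw.Stop();
        Console.WriteLine("lock:        {0} ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Interlocked.Increment(ref counter);
        }
        sw.Stop();
        Console.WriteLine("Interlocked: {0} ms", sw.ElapsedMilliseconds);
    }
}
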
This is a
client/server application, and I expect it one day to support 500
clients (threads) at the same time.

500 isn't a lot, but if you're mapping them 1-to-1 onto threads then
you're choosing a suboptimal design.

Brian
 

Ryan Liu

Thanks Jon and Brian,

I am glad you both mentioned the design patterns for threading.

Can I ask 4 more questions?

1: Is 500 threads considered a lot? Can a cheap or regular server with a
2.0 GHz CPU and 512 MB-1 GB of RAM handle it?

2: Clients and server are sending data through TcpClient all the time (real-time
short messages), so the connection is always open (I do this already; one client
or another has network problems from time to time in a 60-client environment).
So on the server side, I have one thread for each client, sitting on the socket
to keep receiving data from that client.

I cannot think of a way to reduce the number of threads in this situation. Can
you give me some insights?

3: BTW, maybe this is an even worse design: to make my job easier, not only
does the server connect to the database, all clients connect to the database
directly as well. So the database server (most of the time on the same machine
as the application server) also keeps hundreds of connections open. What would
be a better design for this?

4: You have mentioned benchmarking and profiling; what are the better tools
for that? And what are the common (free) tools to monitor resource use and to
check network traffic (load/data)?

That is a lot of questions. Thanks a lot for any help from you or others in
the group!!

Ryan
 

Lucian Wischik

Ryan Liu said:
1: Is 500 threads considered a lot? Can a cheap or regular server with a
2.0 GHz CPU and 512 MB-1 GB of RAM handle it?

500 threads are a heck of a lot, especially if they're all active
(i.e. not all blocked).

Each thread by default takes up 1 MB of virtual address space for its
stack. So straight off the bat you've wasted 500 MB of address space
(out of a total of only 2 GB).

Moreover, their stacks will be scattered over memory, so each context
switch effectively invalidates the cache, and you'll get very bad
cache performance.

Finally, my memory is that each call to CreateThread involves
inter-process communication with one of the Windows system processes
(smss?), which it does so the OS can keep track of all threads and tidy
them up when a process terminates. IPC is always slowish.

If your hardware has N cores, then you will always lose performance if
you have more than N active threads.

The only reason to have lots of threads is because it makes your
program logic easier, and because you want your code to run on future
manycore machines without recompilation. Do 500 threads really make
your code easier to design, understand and debug? It's hard to
believe...
 

Guest

The only reason to have lots of threads is because it makes your
program logic easier, and because you want your code to run on future
manycore machines without recompilation

That is not the reason to be using threads. I am not sure many people
compile their code thinking it will run faster on multi-core machines in the
future. Basically, using a number of threads is useful even on a single-core
machine because they improve the throughput of your program: if you have many
jobs to process and you handle them serially, then when you come across a
large job that takes a long time, many smaller jobs have to wait for the
large job to complete. With threads, however, you can let multiple threads
process jobs pseudo-simultaneously, so you can process more jobs in the same
time. With one core there is really only one thread running at any one
moment, but because you are allowing multiple items to make progress in a
given time, it is a better strategy.
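
A tiny sketch of that idea using the ThreadPool (the job names and durations
are made up for illustration): the small jobs finish while the large one is
still running, instead of queuing up behind it.

using System;
using System.Threading;

class ThroughputDemo
{
    static void Main()
    {
        // One deliberately long job.
        ThreadPool.QueueUserWorkItem(delegate
        {
            Thread.Sleep(5000);                 // simulate a big job
            Console.WriteLine("large job done");
        });

        // Several small jobs; they don't have to wait for the large one.
        for (int i = 1; i <= 5; i++)
        {
            int jobNumber = i;
            ThreadPool.QueueUserWorkItem(delegate
            {
                Thread.Sleep(100);              // simulate a small job
                Console.WriteLine("small job {0} done", jobNumber);
            });
        }

        Console.ReadLine();                     // keep the process alive
    }
}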

Also, threads can give a "perceived" speed increase. For example, when a
program is loading at startup you can allow people to start using the program
while other threads are loading data and initializing things in the background.
Although these tasks still take the same amount of time (or even more, due to
the extra context switching) as if you had executed them serially, to the user
the program is perceived as loading faster because they can use it sooner.

Mark.
 

Lucian Wischik

Mark R. Dawson said:
That is not the reason to be using threads.
Basically, using a number of threads is useful even on a single-core
machine because they improve the throughput of your program.

Sure, what you listed are good reasons to use threads -- I was just
talking about the reasons for using LOTS of threads.
 

Brian Gideon

Ryan said:
1: Is 500 threads considered a lot? Can a cheap or regular server with a
2.0 GHz CPU and 512 MB-1 GB of RAM handle it?

Yeah, it's a lot, but I suspect most machines can handle it. Of
course, that really depends on your definition of "handle".
2: Clients and server are sending data through TcpClient all the time (real-time
short messages), so the connection is always open (I do this already; one client
or another has network problems from time to time in a 60-client environment).
So on the server side, I have one thread for each client, sitting on the socket
to keep receiving data from that client.

I cannot think of a way to reduce the number of threads in this situation. Can
you give me some insights?

There are several design patterns out there that deal with
client-server scenarios. The following is a link to a book I thought
was good. Some of the patterns in the book already exist in the BCL.

http://www.cs.wustl.edu/~schmidt/POSA/POSA2/

Also, take a look at the BeginXXX and EndXXX methods in the BCL.
They're there to help you create scalable applications. The
IAsyncResult object returned by a call to BeginXXX is a variation of
the Asynchronous Completion Token pattern described in the book.
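
As a rough sketch of the shape this takes for a receive loop (the class name,
buffer size and missing error handling are mine, not from the book or the BCL
docs): each client posts an asynchronous receive instead of parking a
dedicated thread on the socket.

using System;
using System.Net.Sockets;

class ClientSession
{
    private readonly Socket socket;
    private readonly byte[] buffer = new byte[4096];

    public ClientSession(Socket socket)
    {
        this.socket = socket;
    }

    public void Start()
    {
        // Post an asynchronous receive; no thread is blocked while waiting.
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                            OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        // Real code needs try/catch here: EndReceive throws if the
        // connection was reset.
        int bytesRead = socket.EndReceive(ar);
        if (bytesRead == 0)
        {
            socket.Close();   // client disconnected
            return;
        }

        // ... process buffer[0..bytesRead) here ...

        // Post the next receive and give the pool thread back.
        socket.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None,
                            OnReceive, null);
    }
}
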
3: BTW, maybe this is an even worse design: to make my job easier, not only
does the server connect to the database, all clients connect to the database
directly as well. So the database server (most of the time on the same machine
as the application server) also keeps hundreds of connections open. What would
be a better design for this?

I typically open a connection object, execute a DB command, and then
close the connection immediately and let ADO.NET connection pooling do
the rest.
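
Something along these lines, for instance (the connection string, table and
query are placeholders):

using System.Data.SqlClient;

static int GetClientCount(string connectionString)
{
    // The connection goes back to the ADO.NET pool as soon as it is disposed,
    // so there is no need to hold one open for the life of each client.
    using (SqlConnection connection = new SqlConnection(connectionString))
    using (SqlCommand command = new SqlCommand(
               "SELECT COUNT(*) FROM Clients", connection))
    {
        connection.Open();
        return (int)command.ExecuteScalar();
    }
}
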
4: You have mentioned benchmarking and profiling; what are the better tools
for that? And what are the common (free) tools to monitor resource use and to
check network traffic (load/data)?

The performance monitor built into Windows is a good start. The better
profiling tools for the .NET Framework aren't free, but I did use an
open source tool a while back. I just can't remember what it was.
Maybe it was NProf.
 

Bruce Wood

Ryan said:
Thanks Jon and Brian,

I am glad you both mentioned the design patterns for threading.

Can I ask 4 more questions?

1: Is 500 threads considered a lot? Can a cheap or regular server with a
2.0 GHz CPU and 512 MB-1 GB of RAM handle it?

2: Clients and server are sending data through TcpClient all the time (real-time
short messages), so the connection is always open (I do this already; one client
or another has network problems from time to time in a 60-client environment).
So on the server side, I have one thread for each client, sitting on the socket
to keep receiving data from that client.

I cannot think of a way to reduce the number of threads in this situation. Can
you give me some insights?

Check out the CCR. It's relatively new and experimental, but it talks
about exactly the sort of problem you're facing:

http://msdn.microsoft.com/msdnmag/issues/06/09/ConcurrentAffairs/default.aspx
 

Chris Mullins

I spend pretty much all day, every day, working on a fancy .NET socket
application.

The model you're looking to use (one thread per client) is good for a few
users, but doesn't scale up very well at all. Your best bet is to learn how
to use asynchronous sockets. This will give you the scalability you're
looking for.

http://www.coversant.net/Default.aspx?tabid=88&EntryID=10
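
To give a feel for the shape of it, here is a minimal accept loop using the
asynchronous socket methods (the names and backlog value are mine); each
accepted socket is then handed to an asynchronous receive loop like the
BeginReceive sketch earlier in the thread, rather than to its own thread.

using System;
using System.Net;
using System.Net.Sockets;

class AsyncServer
{
    private readonly Socket listener =
        new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

    public void Start(int port)
    {
        listener.Bind(new IPEndPoint(IPAddress.Any, port));
        listener.Listen(100);   // backlog; tune as needed
        listener.BeginAccept(OnAccept, null);
    }

    private void OnAccept(IAsyncResult ar)
    {
        Socket client = listener.EndAccept(ar);

        // Immediately post the next accept so new clients aren't kept waiting.
        listener.BeginAccept(OnAccept, null);

        // Hand the connection off to a BeginReceive/EndReceive loop instead
        // of dedicating a thread to it.
        StartReceiving(client);
    }

    private void StartReceiving(Socket client)
    {
        // ... post a BeginReceive here, as sketched earlier ...
    }
}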
 

Jon Skeet [C# MVP]

The only reason to have lots of threads is because it makes your
program logic easier, and because you want your code to run on future
manycore machines without recompilation. Do 500 threads really make
your code easier to design, understand and debug? It's hard to
believe...

I don't know - I find it *much* easier to read through the logic of an
application written with synchronous calls in multiple threads than
apps which are designed to be asynchronous.

Of course, it doesn't scale as well, but I certainly don't find it hard
to believe that it's easier to design, understand and debug a
500-thread app than an asynchronous app.
 

Chris Mullins

Jon Skeet said:
I don't know - I find it *much* easier to read through the logic of an
application written with synchronous calls in multiple threads than
apps which are designed to be asynchronous.

I've had this exact discussion with another fellow here in the office, who,
like me, spends most of his time writing asynchronous code.

While I tend to agree that synchronous is easier, there are classes of
problems which I (and he) find easier to debug when they're asynchronous.
When you go async, the work is broken up into very discrete sections:
- got the callback, and with it comes our state
- processing
- submitting to the next async operation.

When that async callback happens, and we have our state, it's pretty easy to
debug.
Of course, it doesn't scale as well, but I certainly don't find it hard
to believe that it's easier to design, understand and debug a
500-thread app than an asynchronous app.

We've gone back and forth on this one too. It used to be we did prototypes
using lots of threads, and when we were ready to move forward we
"asyncified" them.

At this point, we both find it easier to be async from the very beginning. I
could easily be persuaded that we're just weird and that sync is really
easier....
 

Jon Skeet [C# MVP]

Chris Mullins said:
I've had this exact discussion with another fellow here in the office, who,
like me, spends most of his time writing asynchronous code.

While I tend to agree that synchronous is easier, there are classes of
problems which I (and he) find easier to debug when they're asynchronous.
When you go async, the work is broken up into very discrete sections:
- got the callback, and with it comes our state
- processing
- submitting to the next async operation.

When that async callback happens, and we have our state, it's pretty easy to
debug.

True. I guess there are fewer race conditions to look out for, too.

I'm looking forward to the CCR coming out, which sounds (from what I
remember) as if it should give us at least *some* of the good points of
both worlds, if not the best :)
We've gone back and forth on this one too. It used to be we did prototypes
using lots of threads, and when we were ready to move forward we
"asyncified" them.

At this point, we both find it easier to be async from the very beginning. I
could easily be persuaded that we're just weird and that sync is really
easier....

I think if I *knew* I'd end up async, I'd start off that way. I also
suspect that with enough practice I'd end up finding it reasonably
straightforward too - it's just a different mindset that takes a lot of
effort to get into at the moment.
 
