hashtable - thread safety clarification

James

I read the following in the help for VS C# Express edition:

"Hashtable is thread safe for use by multiple reader threads and a single
writing thread."

If I am using a Hashtable just as described above (multiple threads read
from it but only one thread changes it), should I still be declaring it as
'volatile'?
 
Peter Morris

Declaring it as volatile is only relevant if you do this

Hashtable X;


X = new Hashtable();
X = new Hashtable();
X = new Hashtable();

The reference is volatile, not the contents.
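
One case where a volatile reference does matter is when the writer replaces the whole table instead of modifying it in place. The following is a minimal sketch of that idea, not something from the thread: the class name `CopyOnWriteTable` and its members are hypothetical, and it assumes a single writer thread.

```csharp
using System.Collections;

public static class CopyOnWriteTable
{
    // Hypothetical shared field. "volatile" applies to this reference only,
    // not to the contents of the Hashtable it points to.
    private static volatile Hashtable current = new Hashtable();

    // Intended for the single writer thread: copy, modify the copy, publish.
    public static void Set(object key, object value)
    {
        Hashtable next = (Hashtable)current.Clone(); // shallow copy
        next[key] = value;
        current = next; // volatile write publishes the fully built table
    }

    // Intended for any reader thread: one volatile read, then use the snapshot.
    public static object Get(object key)
    {
        Hashtable snapshot = current;
        return snapshot[key]; // null when the key is absent
    }
}
```

Readers here never see a table that is being modified, because a published table is never changed again. The cost is a full copy on every write, so this only suits tables that change rarely.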
 
James

got it, very nicely explained. Thanks Peter.

Peter Duniho said:
You can't declare the Hashtable itself volatile. You can only declare a
variable referencing the Hashtable as volatile. And doing so only affects
the usage of that variable, not the Hashtable itself. I.e. using
"volatile" ensures that code that reads the variable (not the Hashtable,
but the reference to the Hashtable) will correctly see changes made to the
variable by code in another thread that writes to it before the code that
reads the variable is executed.

In other words, even if you had multiple writers to the Hashtable,
declaring a variable that references the Hashtable as volatile would have
zero effect on the thread-safety of the Hashtable itself.

Pete
 
Ben Voigt [C++ MVP]

James said:
I read the following in the help for VS C# Express edition:

"Hashtable is thread safe for use by multiple reader threads and a single
writing thread."

If I am using a Hashtable just as described above (multiple threads read
from it but only one thread changes it), should I still be declaring it as
'volatile'?

As others have pointed out, volatile won't help you here.

In addition, since Hashtable predates ReaderWriterLock, I suspect the
documentation should say "multiple reader threads OR a single writing
thread".

Looking at the implementation with Reflection confirms there's thread-safety
code, and I can't see any problems with it after looking for a while. In
that case the docs are correct: you can write with one thread while others
are reading, which seems to be one significant advantage of Hashtable over
Dictionary<K,V>.
 
Ben Voigt [C++ MVP]

But, the Hashtable already has a built-in performance issue when dealing
with value types (and it appears from a later post the OP is in fact doing
that) as compared to Dictionary, and that's not even counting the
significant likely difference between whatever reader/writer lock it's
implemented internally, and the newer ReaderWriterLockSlim that's
available in .NET now. And it's almost always the case that code that is
designed to meet a multiple-reader, single-writer requirement (rather than
just having mutual exclusion all around) is expected to perform well and
be scalable.

Hashtable IS designed for multiple-reader, single-writer.
An explicitly-synchronized use of Dictionary using ReaderWriterLockSlim is
probably a better choice in that case.
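
As a rough illustration of the explicitly synchronized alternative mentioned above, here is a minimal sketch. The wrapper class `LockedDictionary` is hypothetical; only `Dictionary<TKey, TValue>` and `ReaderWriterLockSlim` come from the framework.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: many concurrent readers, one writer at a time.
public sealed class LockedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> map = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();

    public bool TryGet(TKey key, out TValue value)
    {
        rw.EnterReadLock(); // shared with other readers, excludes writers
        try { return map.TryGetValue(key, out value); }
        finally { rw.ExitReadLock(); }
    }

    public void Set(TKey key, TValue value)
    {
        rw.EnterWriteLock(); // exclusive: waits for readers to leave
        try { map[key] = value; }
        finally { rw.ExitWriteLock(); }
    }
}
```

Unlike the Hashtable behaviour discussed in this thread, readers in this sketch can make the writer wait, since the write lock is only granted once all readers have left.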

(Of course, if Hashtable has been updated in the more recent versions of
.NET to use ReaderWriterLockSlim, then the above doesn't apply, but it
doesn't appear to have been :) ).

It doesn't appear to use ReaderWriterLockSlim, but it seems to be more like
software transactional memory than locking. Readers speculatively make a
local copy, then repeat if the version number changed during the copy. One
big difference vs locking is that in Hashtable, no reader can ever block the
writer.
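
The "copy, then retry if the version changed" idea can be sketched roughly as follows. This is not Hashtable's actual code, only an illustration of the general technique on a pair of integers; the class `VersionedPair` is hypothetical and assumes exactly one writer thread.

```csharp
using System.Threading;

// Hypothetical illustration of optimistic reads with a version counter.
public sealed class VersionedPair
{
    private int version; // even = stable, odd = write in progress
    private int left, right;

    // Must only be called from the single writer thread.
    public void Write(int l, int r)
    {
        Volatile.Write(ref version, version + 1); // now odd: write in progress
        Thread.MemoryBarrier();
        left = l;
        right = r;
        Thread.MemoryBarrier();
        Volatile.Write(ref version, version + 1); // now even: stable again
    }

    // May be called from any number of reader threads; never blocks the writer.
    public void Read(out int l, out int r)
    {
        while (true)
        {
            int before = Volatile.Read(ref version);
            if ((before & 1) != 0) { Thread.Yield(); continue; } // writer is busy
            l = left;   // speculative local copy
            r = right;
            Thread.MemoryBarrier();
            if (Volatile.Read(ref version) == before) return; // copy is consistent
            // Otherwise the writer intervened during the copy, so retry.
        }
    }
}
```

The reader takes no lock, so it cannot hold up the writer; the price is that a reader may have to repeat its copy when a write overlaps it.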
 
Ben Voigt [C++ MVP]

Peter Duniho said:
I'm not sure why you felt a need to reiterate that. I, the OP, and even
you have already pointed that out. It makes me wonder if what I wrote was
really all that clear.

That was in reference to "the significant likely difference between whatever
reader/writer lock it's implemented internally, and the newer
ReaderWriterLockSlim that's available in .NET now". Since you seemed to be
saying that the "design for multiple-reader, single-writer" makes
ReaderWriterLockSlim better, as if Hashtable wasn't also designed for that.
Probably I just misunderstood what you intended to say.
 
Peter Morris

Hashtable IS designed for multiple-reader, single-writer.

Where is this implemented? For example if I look at the Values property
implementation

public virtual ICollection Values
{
    get
    {
        if (this.values == null)
        {
            this.values = new ValueCollection(this);
        }
        return this.values;
    }
}

There is no synchronisation code there at all, multiple reader threads could
easily access this at the same time and overwrite this.values.


Are you certain about your statement? If you are right then there is a
trick at work here that I am unaware of and would like to know about!
 
Peter Morris

Aha. It's not that Hashtable is thread safe, it's that the result from
Hashtable.Synchronized() is.

I'd still rather add the sync' code myself.
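
For reference, a minimal sketch of using the wrapper returned by Hashtable.Synchronized(). The class and method names are hypothetical. The sketch holds SyncRoot during enumeration on the assumption that the wrapper protects individual operations but not a whole enumeration.

```csharp
using System.Collections;

public static class SynchronizedTableDemo
{
    // Returns the synchronized wrapper around a fresh Hashtable.
    public static Hashtable Create()
    {
        return Hashtable.Synchronized(new Hashtable());
    }

    // Enumeration spans many operations, so hold SyncRoot for its duration.
    public static int CountEntries(Hashtable table)
    {
        int n = 0;
        lock (table.SyncRoot)
        {
            foreach (DictionaryEntry entry in table)
            {
                n++;
            }
        }
        return n;
    }
}
```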
 
Arne Vajhøj

James said:
I read the following in the help for VS C# Express edition:

"Hashtable is thread safe for use by multiple reader threads and a
single writing thread."

If I am using a Hashtable just as described above (multiple threads
read from it but only one thread changes it), should I still be declaring
it as 'volatile'?

The statement you quote only covers the safety of the Hashtable's
internal structure - it does not cover thread safety in general.

Volatile will make changes visible to other threads, but that
may not be sufficient for thread safety in your context.

Using lock on a shared object in all threads is most likely
to be the most robust way of making your app thread safe.
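
A minimal sketch of that approach, assuming a compound read-then-write operation that the Hashtable's own guarantees would not cover. The class `HitCounter` and its members are hypothetical.

```csharp
using System.Collections;

// Hypothetical example: every access goes through the same lock object.
public sealed class HitCounter
{
    private readonly Hashtable counts = new Hashtable();
    private readonly object gate = new object();

    // Read-modify-write must be atomic as a whole, hence the lock.
    public int Increment(string key)
    {
        lock (gate)
        {
            object existing = counts[key];
            int next = (existing == null ? 0 : (int)existing) + 1;
            counts[key] = next;
            return next;
        }
    }

    public int Get(string key)
    {
        lock (gate)
        {
            object existing = counts[key];
            return existing == null ? 0 : (int)existing;
        }
    }
}
```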

Arne
 
Arne Vajhøj

Peter said:
Declaring it as volatile is only relevant if you do this

Hashtable X;

X = new Hashtable();
X = new Hashtable();
X = new Hashtable();

The reference is volatile, not the contents.

That is true.

But your point above it is false.

Volatile does not only affect access to the volatile variable; it
also prevents other memory accesses from moving before a volatile
read or after a volatile write.

So accessing X does imply some synchronization. Whether it solves
the original poster's thread-safety problem is another question.

Arne
 
Arne Vajhøj

Peter said:
You can't declare the Hashtable itself volatile. You can only declare a
variable referencing the Hashtable as volatile. And doing so only
affects the usage of that variable, not the Hashtable itself. I.e.
using "volatile" ensures that code that reads the variable (not the
Hashtable, but the reference to the Hashtable) will correctly see
changes made to the variable by code in another thread that writes to it
before the code that reads the variable is executed.

In other words, even if you had multiple writers to the Hashtable,
declaring a variable that references the Hashtable as volatile would
have zero effect on the thread-safety of the Hashtable itself.

Not so.

Access to any memory cannot move before a volatile read or
after a volatile write, so the use of volatile does affect
more than just the variable itself.

ECMA-335 section 12.6.7:

A volatile read has “acquire semantics” meaning that the read is
guaranteed to occur prior to any references to memory that occur after
the read instruction in the CIL instruction sequence. A volatile write
has “release semantics” meaning that the write is guaranteed to happen
after any memory references prior to the write instruction in the CIL
instruction sequence.

Whether it solves the original poster's thread-safety problem
is uncertain.

In general lock is much better.
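
The ordering rule quoted above can be illustrated with a small publication sketch. This is a hypothetical example (the class `Publisher` is not from the thread): it fills a table once, never modifies it afterwards, and signals readiness through a volatile flag.

```csharp
using System.Collections;

// Hypothetical one-shot publication through a volatile flag.
public sealed class Publisher
{
    private Hashtable data;      // plain field, never modified after Publish
    private volatile bool ready; // volatile flag

    public void Publish()
    {
        Hashtable t = new Hashtable();
        t["answer"] = 42;
        data = t;     // ordinary write
        ready = true; // volatile write: the writes above cannot move after it
    }

    public object TryRead()
    {
        if (!ready) return null; // volatile read: the reads below cannot move before it
        return data["answer"];
    }
}
```

This only covers a table that is built once and then left alone. It says nothing about a writer that keeps modifying the same Hashtable while readers use it, which is the situation the rest of this thread argues about.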

Arne
 
Ben Voigt [C++ MVP]

Peter Morris said:
Where is this implemented? For example if I look at the Values property
implementation

public virtual ICollection Values
{
    get
    {
        if (this.values == null)
        {
            this.values = new ValueCollection(this);
        }
        return this.values;
    }
}

There is no synchronisation code there at all, multiple reader threads
could easily access this at the same time and overwrite this.values.

I hadn't been looking specifically at Values, but at the indexer.

Could there be a problem here? Yes, I think there's a bug in the code you
showed because if the writer thread added a value between the if condition
and the assignment to this.values you would lose the item added by the
writer (note that multiple readers is not a problem). A compare-and-swap
should have been used here, or even easier, just skip the assignment to
this.values and return the empty collection directly.
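
The compare-and-swap variant could look roughly like this. It is a sketch on a hypothetical class (`LazyView` with a nested `ValueView`), not a patch to the real Hashtable source.

```csharp
using System.Collections;
using System.Threading;

// Hypothetical stand-in for a type with a lazily created view object.
public sealed class LazyView
{
    public sealed class ValueView
    {
        private readonly Hashtable owner;
        public ValueView(Hashtable owner) { this.owner = owner; }
        public int Count { get { return owner.Count; } }
    }

    private readonly Hashtable table = new Hashtable();
    private ValueView values;

    public void Add(object key, object value) { table.Add(key, value); }

    public ValueView Values
    {
        get
        {
            if (values == null)
            {
                // Install a new view only if the field is still null; a thread
                // that loses the race discards its own instance.
                Interlocked.CompareExchange(ref values, new ValueView(table), null);
            }
            return values;
        }
    }
}
```

With this, every caller ends up with the same view instance even if several threads hit the getter at once.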
 
Arne Vajhøj

Peter said:
Not so. [...]

And yet, nothing in the remainder of your post contradicted anything I
wrote,

Try and read it again.

"You can only declare a variable referencing the Hashtable as volatile.
And doing so only affects the usage of that variable,"

"A volatile read has “acquire semantics” meaning that the read is
guaranteed to occur prior to any references to memory that occur after
the read instruction in the CIL instruction sequence."

"that variable" and "any references" are two very different semantics.

Arne
 
Arne Vajhøj

Peter said:
Peter said:
Not so. [...]
And yet, nothing in the remainder of your post contradicted anything
I wrote,

Try and read it again.

"You can only declare a variable referencing the Hashtable as volatile.
And doing so only affects the usage of that variable,"

_YOU_ "try and read it again".

You are reading out of context. Why, I have no idea. But, the point of
my post was to address the specific question, not to give a complete
enumeration of everything that "volatile" does. In context, my
statement was precise, to the point, and correct.

No - what you write is flat out wrong according to the ECMA
specification.

The Hashtable object is included in "any references to memory".

Arne
 
Arne Vajhøj

Peter said:
Either you are willfully choosing your own mistaken interpretation of
what I wrote, or you don't actually understand the text you quoted.
Based on your past "corrections", I'd put better odds on the former.
But feel free to admit the latter if you want to insist that you have
correctly read what I wrote.


But the guarantee is with respect to the variable (that is, the
"volatile read"), not those "references to memory". "volatile" isn't
affecting uses of the Hashtable object itself; it's affecting a read
from the volatile variable. Multiple threads using the Hashtable object
itself are not affected by the use of "volatile" on a variable that
references that Hashtable.

That is where you are wrong.

Declaring a variable volatile does not only effect access
to that variable.

It also effect access to all other places in memory, because
the .NET memory model limits what can be moved before and after
the access of the volatile variable.

That is what the quote from ECMA says.

Arne
 
Arne Vajhøj

Peter said:
[...]
Declaring a variable volatile does not only effect access
to that variable.

The word is "affect".

Yeah.

It also effect access to all other places in memory, because
the .NET memory model limits what can be moved before and after
the access of the volatile variable.

Perhaps you are getting hung up on my particular use of the word
"affect". The point is that making the reference to the Hashtable
instance volatile doesn't provide any useful guarantees about using the
Hashtable object itself.

It provides some guarantees about access to the Hashtable.

The original poster did not provide enough to say whether it was
useful guarantees or not.

The "volatile" keyword necessarily has certain effects that may extend
past the variable itself, but the only guarantees provided are with
respect to how things are ordered with respect to that specific variable.

Yes.

But the effect on the volatile variable itself is also just
about ordering of reads and writes.

Same type of impact.

If you declare one variable referencing the Hashtable as "volatile", but
then always use the Hashtable from some other variable, nothing happens
to the uses of the Hashtable. Even if you reference the Hashtable
through the "volatile" variable, there are no useful guarantees about
the data in the Hashtable itself.

Since your original post did not mention anything about useful, then
we seem to agree now that your original post was indeed flat wrong.

And the not useful part is still purely speculation.

If you want to go on nitpicking about things that are completely
irrelevant and orthogonal to the original question, please be my guest.

1) It makes sense to me to correct wrong statements even if it is not
relevant for the original poster, because other people read
the thread and there is no reason why they should end up having the
same misunderstandings about what volatile does and does not do.

2) It could be very relevant for the original poster. I don't know.
You don't know. Because we do not have the context. It could be
critical for the thread safety of his code. Or it could be
irrelevant.

I'm done trying to explain to you why your claim that my post was
incorrect is itself incorrect.

I think we have proven that it was flat wrong.

Arne
 
Ben Voigt [C++ MVP]

No - what you write is flat out wrong according to the ECMA
specification.

The Hashtable object is included in "any references to memory".

Arne, Peter is right here.

All you've shown is that the Hashtable object is maintained coherent wrt
access to the variable. But access to the Hashtable object is NOT
maintained coherent wrt other access to the same Hashtable object. When
that Hashtable reference is read from the volatile variable, you get memory
fences. But that Hashtable reference can then be passed to a variable of
formal type IEnumerable, or even to the Hashtable's own member methods as
the "this" pointer. The object can then be used many times using the actual
parameter. Since the volatile variable isn't part of the access path
anymore, this subsequent access isn't synchronized and doesn't follow
volatile semantics.
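
A small sketch of that point, using a hypothetical class `AliasDemo`: once the reference has been copied out of the volatile field, later accesses go through the copy and never touch the volatile field again.

```csharp
using System.Collections;

public static class AliasDemo
{
    // Hypothetical shared field; only reads and writes of this field are volatile.
    private static volatile Hashtable shared = new Hashtable();

    public static bool LocalAliasBypassesVolatile()
    {
        Hashtable local = shared; // the only volatile read in this method
        shared = new Hashtable(); // volatile write of a new reference
        local["k"] = 1;           // plain access through the alias
        // The alias still refers to the old table; the field refers to the new one.
        return !ReferenceEquals(local, shared)
            && local.Count == 1
            && shared.Count == 0;
    }
}
```

The same holds when the reference is passed as an argument or used as the "this" pointer inside the Hashtable's own methods: those accesses are ordinary ones.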
 
