non-blocking queue review


William Stacey [MVP]

Anyone care to comment on if this non-blocking queue implementation is
sound?

/// <summary>
/// Summary description for NonBlockingQueue.
/// Modeled after:
/// http://www.cs.rochester.edu/u/scott/synchronization/pseudocode/queues.html
/// </summary>
public class NonBlockingQueue
{
    Queue Q;
    int count;

    public int Count
    {
        get { return count; }
    }

    internal class Node
    {
        public object value; // User object.
        public object next;  // Next Node.

        public Node(object value, Node next)
        {
            this.value = value;
            this.next = next;
        }
    }

    internal class Queue
    {
        public object Head = null; // Dequeue from head.
        public object Tail = null; // Enqueue to tail.
    }

    public NonBlockingQueue()
    {
        Q = new Queue();
        Node node = new Node(null, null);
        Q.Head = node;
        Q.Tail = node;
    }

    public void Enqueue(object value)
    {
        object node = new Node(value, null);
        object tail;
        object next;
        while (true)
        {
            tail = Q.Tail;
            next = ((Node)tail).next;
            if ( tail == Q.Tail ) // Does tail still equal Queue Tail?
            {
                if ( next == null )
                {
                    // Try to link node at the end of the linked list.
                    if ( next == Interlocked.CompareExchange(ref ((Node)tail).next, node, next) )
                        break; // Enqueue is done; exit.
                }
                else // Tail was not pointing to null; swing tail to next node.
                    Interlocked.CompareExchange(ref Q.Tail, next, tail);
            }
        } // End loop.
        // Enqueue is done. Try to swing Tail to the inserted node.
        Interlocked.CompareExchange(ref Q.Tail, node, tail);
        Interlocked.Increment(ref count);
    }

    public object Dequeue()
    {
        object value;
        object head;
        object tail;
        object next;
        while (true)
        {
            head = Q.Head;
            tail = Q.Tail;
            next = ((Node)head).next;
            if ( head == Q.Head )
            {
                if ( head == tail ) // Is queue empty or Tail falling behind?
                {
                    if ( next == null ) // Is queue empty?
                        return null; // Queue is empty.
                    Interlocked.CompareExchange(ref Q.Tail, next, tail);
                }
                else // No need to deal with Tail.
                {
                    // Read user value before exchange.
                    // Otherwise, another dequeue might free the next node.
                    value = ((Node)next).value;
                    if (Interlocked.CompareExchange(ref Q.Head, next, head) == head)
                    {
                        Interlocked.Decrement(ref count);
                        return value;
                    }
                }
            }
        } // End loop.
    } // End Dequeue
} // End Class
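
A rough smoke test along these lines - thread and iteration counts are arbitrary,
and passing it proves nothing about the memory model, of course - is one way to
exercise it from several threads at once:

using System;
using System.Threading;

public class NonBlockingQueueSmokeTest
{
    static NonBlockingQueue q = new NonBlockingQueue();

    public static void Main()
    {
        Thread[] threads = new Thread[4]; // Arbitrary thread count.
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(new ThreadStart(Work));
            threads[i].Start();
        }
        for (int i = 0; i < threads.Length; i++)
            threads[i].Join();

        // Every Enqueue is paired with a Dequeue, so Count should end at 0
        // if the queue behaves.
        Console.WriteLine("Final count: " + q.Count);
    }

    static void Work()
    {
        for (int i = 0; i < 100000; i++) // Arbitrary iteration count.
        {
            q.Enqueue(i);
            q.Dequeue();
        }
    }
}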
 

Jon Skeet [C# MVP]

William Stacey said:
Anyone care to comment on if this non-blocking queue implementation is
sound?

Well, it's got some very odd code in it:
tail = Q.Tail;
next = ((Node)tail).next;
if ( tail == Q.Tail ) // Does tail equal Queue Tail.

Why would tail *not* equal Q.Tail at this stage? Presumably because
something else has done some queuing in the mean time?

Frankly, it all smacks of trying to be extremely clever so as not to
block at all - in my experience, code like that needs to be looked at
*very* carefully, preferably by an expert in the particular memory
model of the platform you're using.

Do you really need this? As in, do you have evidence that a normal
"blocking" (very briefly!) queue is killing the performance in your
app?
 

William Stacey [MVP]

Well, it's got some very odd code in it:
Yes, it is based on one of Michael Scott's queues at:
http://www.cs.rochester.edu/u/scott/synchronization/
Under "Fast concurrent queue algorithms. We believe these algorithms to be
the best concurrent queues available,..."

Just trying to duplicate it in C#.
Why would tail *not* equal Q.Tail at this stage? Presumably because
something else has done some queuing in the mean time?

I had the same question, Jon. I assume the same thing your second sentence
talks about.
Frankly, it all smacks of trying to be extremely clever so as not to
block at all - in my experience, code like that needs to be looked at
*very* carefully, preferrably by an expert in the particular memory
model of the platform you're using.

I would agree, hence the public post. Hopefully I can leverage some gray
matter out here.
Do you really need this? As in, do you have evidence that a normal
"blocking" (very briefly!) queue is killing the performance in your
app?

No, just doing it as a mental challenge and learning exercise. I also posted a
TwoLockQueue that is very elegant and easy to visualize IMO, but it still needs
to be verified by eye and heavy use. Even the "experts" are proved wrong on
their own algorithms over time, so I try to verify as much as possible.
 

Jon Skeet [C# MVP]

William Stacey said:
And this "Two-Lock Queue".

(Small point - you posted the entire code *after* your signature
section. For newsreaders which trim signatures automatically, that's a
pain.)

In the dequeuing code:

newHead = node.next;    // Read next node.
if ( newHead == null )  // Is queue empty?
{
    value = null;       // Set out value object to null.
    return false;       // Queue was empty.
}

value = newHead.value;  // Queue not empty. Read value before release.
Q.Head = newHead;       // Swing Head to next node.

This has a "memory leak"-like problem, I believe.

The new head node still has a reference to the value which has just
been dequeued, which means that it won't be garbage collected.

I'm not sure what the benefit of this system where there's always a
sort of "dummy head element" is over just checking whether the head
node reference itself is null. It may be a bit more efficient in the
tail code - but if the cost is bugs, I'd rather take simplicity any day
:)

I'd also suggest making the nested classes private with internal
members rather than internal with public members. The constructor for
Node would need to be internal, of course.
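
Something along these lines is what I have in mind (just a sketch of the
shape, untested):

private class Node
{
    internal object value; // User object.
    internal object next;  // Next Node.

    internal Node(object value, Node next)
    {
        this.value = value;
        this.next = next;
    }
}

private class Queue
{
    internal object Head; // Dequeue from head.
    internal object Tail; // Enqueue to tail.
}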
 

Jon Skeet [C# MVP]

William Stacey said:
Yes, it is based on one of Michael Scott's queues at:
http://www.cs.rochester.edu/u/scott/synchronization/
Under "Fast concurrent queue algorithms. We believe these algorithms to be
the best concurrent queues available,..."

Just trying to duplicate in c#.


I had same question Jon. I assume the same thing your second sentence talks
about.

Right. Unfortunately, there's no guarantee (that I can see) that things
won't change immediately after the test. It's unclear what assumptions
the pseudo-code is making about the memory model. The comments are
unfortunately of the worst kind - they tell you what the code is doing
(which is usually obvious) rather than *why* it's doing it (which is
much more useful).
I would agree, hence the public post. Hopefully I can leverage some gray
matter out here.

I don't think there are many memory model experts on the group though,
I'm afraid. Valery Pryamikov has shown himself to know a heck of a lot
about concurrency in general, but it's really the .NET memory model
details which need to be analysed here.

The fact that I usually end up being the one talking about memory model
stuff here shows the low level of expertise on the group, unfortunately
:( (I'm a hell of a long way from being an expert on it.)
No, just doing it as a mental challenge and learning exercise. I also posted a
TwoLockQueue that is very elegant and easy to visualize IMO, but it still needs
to be verified by eye and heavy use. Even the "experts" are proved wrong on
their own algorithms over time, so I try to verify as much as possible.

I would say the singleton article we've been commenting on in terms of
memory barriers is a good indication that if there's any debate, just
don't use it!
 

William Stacey [MVP]

(Small point - you posted the entire code *after* your signature
section. For newsreaders which trim signatures automatically, that's a
pain.)

I posted it as an attachment and I see it as a txt attachment. What ng reader
do you use?
This has a "memory leak"-like problem, I believe.

The new head node still has a reference to the value which has just
been dequeued, which means that it won't be garbage collected.

Yes I suppose:
value = newHead.value;
newHead.value = null;

May be better. However, I would think it would be GCed after the user releases
the value and after head is moved to the next node. If head is never moved,
then that one object would live on.
I'm not sure what the benefit of this system where there's always a
sort of "dummy head element" is over just checking whether the head
node reference itself is null. It may be a bit more efficient in the
tail code - but if the cost is bugs, I'd rather take simplicity any day

Yes. I "think" it has to do with concurrency when the queue is empty. I still
need to draw this out and walk it through very slowly.
I'd also suggest making the nested classes private with internal
members rather than internal with public members. The constructor for
Node would need to be internal, of course.

Agreed. Other ideas are welcome. Cheers!
 

Jon Skeet [C# MVP]

William Stacey said:
I posted it as an attachment and I see it as a txt attachment. What ng reader
do you use?

Gravity. I saw it "inline" but immediately after your text. The first
article didn't have the same problem. In general, I don't think it's
worth adding text files as attachments - just paste them in.
Yes I suppose:
value = newHead.value;
newHead.value = null;

May be better.
Yup.

However, I would think it would be GCed after the user releases the
value and after head is moved to the next node. If head is never moved, then
that one object would live on.
Yes.


Yes. I "think" it has to do with concurrency when the queue is empty. I still
need to draw this out and walk it through very slowly.

Certainly when the queue is empty, you'd need to modify both head *and*
tail, which means having both locks (in the right order, of course).
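
Just to spell out the lock ordering point, the shape is something like this
(sketch only) whenever an operation has to touch both ends:

// Whenever both ends must change together (e.g. emptying the queue),
// take the locks in one fixed order everywhere - head before tail here -
// so two threads can never each hold one lock while waiting for the other.
lock (headLock)
{
    lock (tailLock)
    {
        // Safe to update both head and tail in here.
    }
}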
Agreed. Other ideas are welcome. Cheers!

One thing - I suspect you need a call to Interlocked.(Something) to
return Count accurately, too.
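
Something like this, perhaps - comparing and exchanging with the same value
never changes the field, but it does go through the interlocked machinery, so
the value you get back should be current (untested sketch):

public int Count
{
    get
    {
        // A no-op compare-exchange: if count is 0 it is "replaced" with 0,
        // otherwise nothing happens - either way the returned value is a
        // fresh, fenced read rather than a possibly stale one.
        return Interlocked.CompareExchange(ref count, 0, 0);
    }
}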
 

William Stacey [MVP]

Should have posted this. I think this is the latest paper that relates
these queues.
http://citeseer.ist.psu.edu/cache/p...rent_queue_algorithms.pdf/michael96simple.pdf
Right. Unfortunately, there's no guarantee (that I can see) that things
won't change immediately after the test. It's unclear what assumptions
the pseudo-code is making about the memory model. The comments are
unfortunately of the worst kind - they tell you what the code is doing

As they use C, I don't think they make any assumptions about the memory model.
So if you program it for the worst-case model, then it should work for .NET, I
would think?
I don't think there are many memory model experts on the group though,
I'm afraid. Valery Pryamikov has shown himself to know a heck of a lot
about concurrency in general, but it's really the .NET memory model
details which need to be analysed here.

Yes. I see you and I have been busy at different places talking about
barriers and memory models. One of the reasons I did these queues. If we
could prove these out, then we could have a baseline to build on and refer
to (of what to do or what not to do). Simple is better in general, but when
you have deeper discussions, it would be good to have something working,
such as the no-lock queue, to talk from.
I would say the singleton article we've been commenting on in terms of
memory barriers is a good indication that if there's any debate, just
don't use it!

Agree. But I want to get these two working 100% to demonstrate the hoops you
need to jump through when you don't take the simple route. Cheers!
 

William Stacey [MVP]

Gravity. I saw it "inline" but immediately after your text. The first
article didn't have the same problem. In general, I don't think it's
worth adding text files as attachments - just paste them in.

The first one I posted inline, the second as an attachment. Sometimes the line
length breaks and makes it hard to cut and paste into VS, so you need to fix
it up. When I see code, I would rather cut it from a txt file, but this is
personal pref.
One thing - I suspect you need a call to Interlocked.(Something) to
return Count accurately, too.

Good question Jon. I had the same one. I "think" you can safely *read* a
var without issue if you use Interlocked to increment and decrement that var;
as Interlocked is atomic, the read could never return a bad or incomplete
value (the MS docs show this as well, but they could be wrong too).
Naturally, someone out there will prove this wrong, so I welcome the info.
Cheers!
 

Jon Skeet [C# MVP]

William Stacey said:
The first one I posted inline, the second as an attachment. Sometimes the line
length breaks and makes it hard to cut and paste into VS, so you need to fix
it up. When I see code, I would rather cut it from a txt file, but this is
personal pref.

I just prefer being able to cut and paste rather than going to a menu
to decode, then opening it, then cutting and pasting etc.
Good question Jon. I had the same one. I "think" you can safely *read* a
var without issue if you use Interlocked to increment and decrement that var;
as Interlocked is atomic, the read could never return a bad or incomplete
value (the MS docs show this as well, but they could be wrong too).
Naturally, someone out there will prove this wrong, so I welcome the info.

I don't think there's any such guarantee, to be honest - otherwise read
memory barriers would be pointless...
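
An alternative to the CompareExchange trick for Count would be an explicit
volatile read (again, just a sketch):

public int Count
{
    get
    {
        // VolatileRead gives the read acquire semantics, so you see the
        // latest value written by Interlocked.Increment/Decrement rather
        // than whatever happens to be cached.
        return Thread.VolatileRead(ref count);
    }
}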
 

Jon Skeet [C# MVP]

William Stacey said:
Should have posted this. I think this is the latest paper that relates
these queues.
http://citeseer.ist.psu.edu/cache/papers/cs/268/ftp:zSzzSzftp.cs.rochester.eduzSzpubzSzpaperszSzsystemszSz96.PODC.Non-blocking_and_blocking_concurrent_queue_algorithms.pdf/michael96simple.pdf


As they use C, I don't think they make any assumptions about the memory model.
So if you program it for the worst-case model, then it should work for .NET, I
would think?

I'm not entirely convinced. For instance, many people cite the double-
checked locking algorithm as though it'll just work - but that *very*
much depends on the memory model being used. I'll read the paper and
see if I can understand it though...
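
For reference, this is the pattern I mean - the "obvious" C# version that
people tend to assume is safe; whether another thread can see a non-null
instance before the constructor's writes are visible is precisely the memory
model question:

// Classic double-checked locking. The debate is whether, on a weak
// memory model, a second thread could observe a non-null 'instance'
// before the writes made inside the Singleton constructor are visible.
public sealed class Singleton
{
    static Singleton instance;
    static readonly object padlock = new object();

    Singleton() {}

    public static Singleton Instance
    {
        get
        {
            if (instance == null)          // First check, without the lock.
            {
                lock (padlock)
                {
                    if (instance == null)  // Second check, under the lock.
                        instance = new Singleton();
                }
            }
            return instance;
        }
    }
}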
Yes. I see you and I have been busy at different places talking about
barriers and memory models. One of the reasons I did these queues. If we
could prove these out, then we could have a baseline to build on and refer
to (of what to do or what not to do). Simple is better in general, but when
you have deeper discussions, it would be good to have something working,
such as the no-lock queue, to talk from.

It would certainly be interesting to see such a thing - but I'm not at
all convinced by this implementation, I'm afraid.
Agree. But I want to get these two working 100% to demonstrate the hoops you
need to jump through when you don't take the simple route. Cheers!

No problems.
 

William Stacey [MVP]

I don't think there's any such guarantee, to be honest - otherwise read
memory barriers would be pointless...

Not sure. I guess better to be safe.
 

William Stacey [MVP]

It would certainly be interesting to see such a thing - but I'm not at
all convinced by this implementation, I'm afraid.

I believe in the general idea that using Interlocked can and does work. And
I ~think their implementation does work. I question my implementation in
C#, but feel this is probably pretty close. I sent him an email to
hopefully spark interest in seeing his algorithm implemented in C#. Maybe
he will give it to a grad student or something to prove it out. I will post
any reply. Cheers!
 

William Stacey [MVP]

I don't think there's any such guarantee, to be honest - otherwise read
memory barriers would be pointless...

I guess that is the question of the week. Wonder if the following answers this
question or not?

http://msdn.microsoft.com/library/d...synchronization_and_multiprocessor_issues.asp

"Processors can be instructed to force their memory caches to agree with
main memory with special instructions. Such instructions ensure that
previous read and write requests have completed and are made visible to
other processors, and to ensure that no subsequent read or write
requests have started. Examples are:

Functions which enter or leave critical sections.
Functions which signal synchronization objects.
Wait functions.
Interlocked functions
Consequently, the multiprocessor race condition above can be repaired as
follows:

BOOL volatile fValueHasBeenComputed = FALSE;

void CacheComputedValue()
{
    if (!fValueHasBeenComputed)
    {
        iValue = ComputeValue();
        InterlockedExchange((LONG*)&fValueHasBeenComputed, TRUE);
    }
}
The InterlockedExchange function ensures that the value of iValue is updated
for all processors before the value of fValueHasBeenComputed is set to
TRUE."
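
Translating that snippet into C# (my rough equivalent, not from the MSDN page,
and untested) would look something like:

// Rough C# translation of the MSDN example above. Interlocked.Exchange
// acts as a full fence, so iValue is published before the flag flips.
// ComputeValue is just the placeholder from the article.
static int iValue;
static int fValueHasBeenComputed = 0;

static int ComputeValue()
{
    return 42; // Stand-in for the real (expensive) computation.
}

static void CacheComputedValue()
{
    if (Thread.VolatileRead(ref fValueHasBeenComputed) == 0)
    {
        iValue = ComputeValue();
        Interlocked.Exchange(ref fValueHasBeenComputed, 1);
    }
}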
 

Jon Skeet [C# MVP]

William Stacey said:
I guess that is the question of the week. Wonder if the following answers this
question or not?

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/synchronization_and_multiprocessor_issues.asp

"Processors can be instructed to force their memory caches to agree with
main memory with special instructions. Such instructions ensure that
previous read and write requests have completed and are made visible to
other processors, and to ensure that no subsequent read or write
requests have started. Examples are:

Functions which enter or leave critical sections.
Functions which signal synchronization objects.
Wait functions.
Interlocked functions

Aha. In that case I guess it probably is okay - that bit of the code,
anyway. I'll have a look at the rest with a fine tooth comb when I get
the chance (when I've finished my own threading article, for a start!)
 

William Stacey [MVP]

Updated two-lock queue. Until someone can help with a no-lock queue in C#,
I would go with this one. On a multi-CPU box, a read and a write can happen
at the same time (hence the two locks).

/// <summary>
/// Two-Lock Concurrent Queue Algorithm
/// http://www.cs.rochester.edu/u/scott/synchronization/pseudocode/queues.html
/// </summary>
public class TwoLockQueue
{
    private Node head;
    private Node tail;
    private readonly object headLock;
    private readonly object tailLock;
    private int count;

    public TwoLockQueue()
    {
        headLock = new object();
        tailLock = new object();
        Node node = new Node(null);
        head = tail = node;
        count = 0;
    }

    public int Count
    {
        get { return count; }
    }

    public ArrayList Values
    {
        get
        {
            lock (headLock)
            {
                lock (tailLock)
                {
                    ArrayList values = new ArrayList();
                    Node node = head.next;
                    while ( node != null )
                    {
                        values.Add(node.value);
                        node = node.next;
                    }
                    return values;
                }
            }
        }
    }

    private class Node
    {
        internal object value;
        internal Node next;

        internal Node(object value)
        {
            this.value = value;
            next = null;
        }
    }

    public void Enqueue(object value)
    {
        Node node = new Node(value);
        lock (tailLock)
        {
            tail.next = node; // Link node at the end of the linked list.
            tail = node;      // Swing Tail to new node.
            Interlocked.Increment(ref count);
        }
    }

    public object Dequeue()
    {
        object value = null;
        lock (headLock)
        {
            Node first = head.next;  // Get pointer to first object or null if empty.
            if ( first != null )     // Have object to dequeue?
            {
                value = first.value; // We have an object.
                head = first;        // Set head to next object.
                head.value = null;   // Clear value so we don't hold a reference.
                Interlocked.Decrement(ref count);
            }
            return value;
        }
    }
}
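
Typical usage is just the obvious thing. A rough sketch (counts arbitrary)
with one producer and one consumer running at the same time, which is where
the two locks pay off:

using System;
using System.Threading;

public class TwoLockQueueDemo
{
    static TwoLockQueue q = new TwoLockQueue();

    public static void Main()
    {
        // Producer and consumer take different locks, so they can run
        // concurrently on a multi-CPU box.
        Thread producer = new Thread(new ThreadStart(Produce));
        Thread consumer = new Thread(new ThreadStart(Consume));
        producer.Start();
        consumer.Start();
        producer.Join();
        consumer.Join();
        Console.WriteLine("Left in queue: " + q.Count);
    }

    static void Produce()
    {
        for (int i = 0; i < 1000; i++) // Arbitrary count.
            q.Enqueue("item " + i);
    }

    static void Consume()
    {
        for (int i = 0; i < 1000; i++)
        {
            // Returns null if the producer has not caught up yet.
            q.Dequeue();
        }
    }
}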
 
