yield return + finally unreliable?

M

Marcel Müller

I have a function that dispatches I/O Tasks while operation on transient
data. There is a deadlock from time to time (about every 3 weeks in
production environment).

Analysis of (forced) crash dumps show that many threads are waiting for
a peding object to be read but no one seems to load the data from the
backend.

Rough pattern:

public IEnumerable<E> GetObjects<E>(IEnumerable<Ref<E>> orefs)
where E : StrongEntityBase<E>

E is some object of an arbitrary entity while Ref<E> is only a
lightwight reference. The function retrieves the requested objects.

Implementation:

var notcached = new List<E>(READPACKAGESIZE);
var waitworker = new WaitForObjectState<E>();
try
{ // retrieve all objects from the cache or create new ones in the cache.
// The new ones are identified by their state and need to be loaded
// before they are returned.
// The call does not block on I/O.
var source = StrongEntityBase<E>.CacheWorker.LookupOrCreate(orefs);

foreach (E obj in source)
{retry:
switch (obj.ObjectState)
{case ObjectState.Virgin: // new object
//
if (!obj.SetPendingState())
goto retry; // another thread already set the state to Pending
notcached.Add(obj);

if (notcached.Count >= READPACKAGESIZE)
{ // Read data from backend. This function blocks.
DoReadPackage(notcached);
foreach (E obj2 in notcached)
yield return obj2;
notcached.Clear();
}
break;

case ObjectState.Pending: // in work by another thread
// wait on obj for the next state change.
waitworker.WaitForObject(obj);
goto retry;

case ObjectState.Valid: // Cached
// cache hit
yield return obj;
}
}

// return remaining objects in last package
// Read data from backend. This function blocks.
DoReadPackage<E>(notcached);
foreach (E obj2 in notcached)
yield return obj2;
notcached.Clear();
}
finally
{ waitworker.Dispose();
// read last package in case of incomplete enumeration because
// another thread could wait on that data.

// in case of exceptions some objects could already be loaded.
notcached.RemoveAll(obj => obj.ObjectState != ObjectState.Pending);

// Read the remaining objects.
DoReadPackage<E>(notcached);
}


The critical point is in my opinion the finally block. If it is not
executed then objects might be left in the cache in state Pending
without any thread going to read them.

So the question is whether finally in a function with yield return is
reliable?
What happens if some code misses to dispose the enumerator returned by
the class generated from yield return?

I do not think that there is a race condition while accessing
ObjectState. ObjectState is only changed by Interlocked.CompareExchange
and it follows a directed graph.
All objects (and their children) are immutable as soon as they are in
state Valid. Before state Valid they are owned by the thread that
changed the state to Pending. This thread is responsible for turning the
state to Valid. The above code should ensure that.
The function DoReadPackage never returns until all objects are in state
Valid. This is proven.


Marcel
 
M

Marcel Mueller

Problem confirmed!

The following code fails to walk through the finally block in GetStrings
in the last test case.

class Program
{
static void Main(string[] args)
{
using (var en = GetStrings("with dispose").GetEnumerator())
{ en.MoveNext();
Console.WriteLine("got "+en.Current);
}

try
{ foreach (var s in GetStrings("foreach with exception"))
{ Console.WriteLine("got "+s);
throw new Exception("Test");
}
} catch (Exception ex)
{ Console.WriteLine("cought "+ex.Message);
}

{ var en = GetStrings("w/o dispose").GetEnumerator();
en.MoveNext();
Console.WriteLine("got "+en.Current);
// Missing Dispose
en = null;
GC.Collect();
Console.WriteLine("Collected");
}
}

static IEnumerable<string> GetStrings(string name)
{ Console.WriteLine("\nGetStrings "+name);
try
{ yield return "Erwin";
yield return "Kurt";
} finally
{ Console.WriteLine("Finally");
}
}
}


I think the problem is that the finally block in a yield return function
is similar to an unmanaged resource. It must be covered not only by
Dispose but also by the finalizer. But the Class generated by GetStrings
implements IDisposable but does not implement a finalizer.

The bad news is that this problem tends to create cascades. I.e. if you
use foreach or using inside the yield return function, their objects may
not be cleaned up correctly.

I see why there is no finalizer. Most likely this is for performance
reasons. But as long as Dispose calls SupressFinalize the penalty is
moderate. So either finally should be implemented correctly or it should
not be supported together with yield return, because it creates hard to
find bugs. In my case the entire Web server ended up in a deadlock. And
dealing with the dump file of the in memory database was not too much fun.


Marcel
 
A

Arne Vajhøj

Problem confirmed!

The following code fails to walk through the finally block in GetStrings
in the last test case.

class Program
{
static void Main(string[] args)
{
using (var en = GetStrings("with dispose").GetEnumerator())
{ en.MoveNext();
Console.WriteLine("got "+en.Current);
}

try
{ foreach (var s in GetStrings("foreach with exception"))
{ Console.WriteLine("got "+s);
throw new Exception("Test");
}
} catch (Exception ex)
{ Console.WriteLine("cought "+ex.Message);
}

{ var en = GetStrings("w/o dispose").GetEnumerator();
en.MoveNext();
Console.WriteLine("got "+en.Current);
// Missing Dispose
en = null;
GC.Collect();
Console.WriteLine("Collected");
}
}

static IEnumerable<string> GetStrings(string name)
{ Console.WriteLine("\nGetStrings "+name);
try
{ yield return "Erwin";
yield return "Kurt";
} finally
{ Console.WriteLine("Finally");
}
}
}


I think the problem is that the finally block in a yield return function
is similar to an unmanaged resource. It must be covered not only by
Dispose but also by the finalizer. But the Class generated by GetStrings
implements IDisposable but does not implement a finalizer.

The bad news is that this problem tends to create cascades. I.e. if you
use foreach or using inside the yield return function, their objects may
not be cleaned up correctly.

I see why there is no finalizer. Most likely this is for performance
reasons. But as long as Dispose calls SupressFinalize the penalty is
moderate. So either finally should be implemented correctly or it should
not be supported together with yield return, because it creates hard to
find bugs. In my case the entire Web server ended up in a deadlock. And
dealing with the dump file of the in memory database was not too much fun.

I am not sure why you are surprised.

It is documented in the C# specs that:
* yield return never executes finally
* yield break executes finally
* exception inside executes finally
* Dispose executes finally
* using always calls Dispose
* foreach always calls Dispose

There are no reasons to expect that finally would execute in
any other cases.

In general having an object that implements IDisposable and
not calling Dispose is bad.

I am not surprised that the last pattern created problems.

This guy:

http://blogs.msdn.com/b/dancre/arch...nd-usings-your-dispose-may-not-be-called.aspx
was somewhat surprised but seems to buy the argument that it
is not a bug.

Arne
 
S

Sun Tiantang

Problem confirmed!

The following code fails to walk through the finally block in GetStrings
in the last test case.
class Program

static void Main(string[] args)

using (var en = GetStrings("with dispose").GetEnumerator())
{ en.MoveNext();
Console.WriteLine("got "+en.Current);



{ foreach (var s in GetStrings("foreach with exception"))
{ Console.WriteLine("got "+s);
throw new Exception("Test");

} catch (Exception ex)
{ Console.WriteLine("cought "+ex.Message);


{ var en = GetStrings("w/o dispose").GetEnumerator();

Console.WriteLine("got "+en.Current);
// Missing Dispose
en = null;





static IEnumerable<string> GetStrings(string name)
{ Console.WriteLine("\nGetStrings "+name);

{ yield return "Erwin";
yield return "Kurt";
} finally
{ Console.WriteLine("Finally");





I think the problem is that the finally block in a yield return function
is similar to an unmanaged resource. It must be covered not only by
Dispose but also by the finalizer. But the Class generated by GetStrings
implements IDisposable but does not implement a finalizer.

The bad news is that this problem tends to create cascades. I.e. if you
use foreach or using inside the yield return function, their objects may
not be cleaned up correctly.

I see why there is no finalizer. Most likely this is for performance
reasons. But as long as Dispose calls SupressFinalize the penalty is
moderate. So either finally should be implemented correctly or it should
not be supported together with yield return, because it creates hard to
find bugs. In my case the entire Web server ended up in a deadlock. And
dealing with the dump file of the in memory database was not too much fun.



I am not sure why you are surprised.



It is documented in the C# specs that:

* yield return never executes finally

* yield break executes finally

* exception inside executes finally

* Dispose executes finally

* using always calls Dispose

* foreach always calls Dispose



There are no reasons to expect that finally would execute in

any other cases.



In general having an object that implements IDisposable and

not calling Dispose is bad.



I am not surprised that the last pattern created problems.



This guy:



http://blogs.msdn.com/b/dancre/arch...nd-usings-your-dispose-may-not-be-called.aspx

was somewhat surprised but seems to buy the argument that it

is not a bug.



Arne


I find this post very interesting but one thing which still confuses me is:
* using always calls Dispose

I thought this is always the case but the blog link you gave showed that using does NOT always call Dispose hence he had the bug which he blogged. Canyou explain this further please?
Many thanks.
Tiantang
 
A

Arne Vajhøj

I find this post very interesting but one thing which still confuses me is:
* using always calls Dispose

I thought this is always the case but the blog link you gave showed that using does NOT always call Dispose
hence he had the bug which he blogged. Can you explain this further
please?

I can understand you getting confused by the seemingly discrepancy.

The original problem and my reply are about when using is applied
to the enumerator. The link is about when using is applied inside
the yielding method. So using in the link has the same role as
finally in the original problem. And it does not get executed
because the method is never "completed" and the called code
has no way of knowing that the calling code will not "complete"
the iteration.

Sorry for the confusion.

Arne
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top