Scope of IEnumerable object within foreach

B

bughunter

This is partly 'for the record' and partly a query about whether the
following is a bug somewhere in .Net (whether it be the CLR, JITter, C#
compiler). This is all in the context of .Net 1.1 SP1.


Recently we (my fellow team members & I) observed an
InvalidOperationException - 'collection has been modified', however it
wasn't immediately obvious why this would be occuring. There are no
modifications occuring within the loop and there is not another thread
that could be making a change.

The object being looped over was derived from CollectionBase and the
culprit turned out to be a finalizer on the class, a finalizer that
called Clear() thus modifying the collection. My question then is, why
did the finalizer get called while the CollectionBase object was still
in scope? Or more to the point, the object has clearly gone out of
scope and so should it have done?

Here's some code to demonstrate the issue fairly reliably (on my PC)...

---------------------------------------------
// Our problematic CollectionBase class.
public class TestCollection : CollectionBase
{
public void Add(object o)
{
List.Add(o);
}

~TestCollection()
{
this.Clear();
}
}

// A helper class. This has no real relevance to the problem.
public class ListItem
{
private string itemCode;
private string itemDescription;

public ListItem(string itemCode, string itemDescription)
{
this.itemCode = itemCode;
this.itemDescription = itemDescription;
}

public override string ToString()
{
return itemDescription;
}
}
---------------------------------------------



If you then create a form in Visual Studio, drop a ComboBox and a
button on there and add the following code:



--------------------------------------------
// Creates an instance of TestCollection with some ListItems(see above)
in it.
private TestCollection BuildCol(int size)
{
TestCollection oList = new TestCollection();

for(int i=0; i<size; i++)
{
string s = i.ToString();
oList.Add(new ListItem(s,s));
}

return oList;
}

// Populates our combobox.
private void PopulateCombobox()
{
TestCollection col = BuildCol(1000);

foreach(object o in col) // exception thrown here!
{
comboBox1.Items.Add(o);

}

comboBox1.Sorted = true;
}

// From the button click event we populate the combobox several times
over to
// increase the chances of reproducing the problem.
private void button1_Click(object sender, System.EventArgs e)
{
for(int i=0; i<100; i++)
{
PopulateCombobox();
comboBox1.Items.Clear();
}
}
--------------------------------------------


Running this code generates an exception on every clicks of the button
here. If you don;t get this then you can add a call to GC.Collect()
within the foreach loop to cause a garbage collection which in turn
will invoke the TestCollection's finalizer.

What I think is happening here is that foreach is just shorthand for
something like:

--------------
IEnumerator e = col.GetEnumerator();
while(e.Current!=null)
{
// Do stuff here.

e.MoveNext();
}
--------------

So although technically 'col' is in scope for the lifetime of the loop,
in reality a code optimizer may be flagging it as out of scope since it
is not actually being used within the loop. If a garbage collection
happens to occur then the finalizer is called, modifying the collection
and bang! Of course the IEnumerator would normally have a reference to
the collection so this really doesn't seem right to me.

Any thoughts?

Colin Green
 
J

Jon Skeet [C# MVP]

This is partly 'for the record' and partly a query about whether the
following is a bug somewhere in .Net (whether it be the CLR, JITter, C#
compiler). This is all in the context of .Net 1.1 SP1.


Recently we (my fellow team members & I) observed an
InvalidOperationException - 'collection has been modified', however it
wasn't immediately obvious why this would be occuring. There are no
modifications occuring within the loop and there is not another thread
that could be making a change.

The object being looped over was derived from CollectionBase and the
culprit turned out to be a finalizer on the class, a finalizer that
called Clear() thus modifying the collection.

Did you really need the finalizer in the first place? Very, very few
classes really need finalizers.
My question then is, why
did the finalizer get called while the CollectionBase object was still
in scope? Or more to the point, the object has clearly gone out of
scope and so should it have done?

Let's be clear about things:
1) Objects don't have scope
2) Variables have scope
3) Scope *in itself* doesn't entirely govern whether or not a variable
prevents the object it refers to from being garbage collected

<snip>

Variables don't prevent an object from being garbage collected when the
JIT can tell that the variable's value isn't used again. For instance,
if you do:

object o = new object();

Thread.Sleep(1000); // During this sleep, the first object is eligible
// for garbage collection

o = new object();
What I think is happening here is that foreach is just shorthand for
something like:

--------------
IEnumerator e = col.GetEnumerator();
while(e.Current!=null)
{
// Do stuff here.

e.MoveNext();
}
--------------
Indeed.

So although technically 'col' is in scope for the lifetime of the loop,
in reality a code optimizer may be flagging it as out of scope since it
is not actually being used within the loop. If a garbage collection
happens to occur then the finalizer is called, modifying the collection
and bang! Of course the IEnumerator would normally have a reference to
the collection so this really doesn't seem right to me.

In the case of CollectionBase, however, the enumerator doesn't need to
have a reference to the CollectionBase itself - just the "inner list"
it contains.

So, your CollectionBase was genuinely being finalized, although the
"inner list" it used was still in use. Unfortunately, your rogue
finalizer cleared the list, hence the exception you saw.
 
B

bughunter

Hi Jon,

Thanks for the clarification. There was an air of mystery surrounding
this bug for a while so it's nice to fully understand what is going on.
It's worth noting that at least one other person here working on a
seperate project had also come across the same issue and had just
marked it as a bug in dotnet, using workarounds to solve the problem
such as lock statements, GC.KeepAlive() and avoiding the foreach
statement. This makes me wonder if there are others out there that have
made the same assumption and who now just avoid using foreach.

Cheers,

Colin Green
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top