Framework Bug: KeyedCollection<T>

  • Thread starter Chris Mullins [MVP]
  • Start date
C

Chris Mullins [MVP]

KeyedCollection is a very handy little class, that unforutnatly has a nasty
bug in it.

The bug (which I ran across) causes the following code to fail:

if (!rooms.Contains(room))
_rooms.Add(room);

The problem is that contains returns "false", but then Add throws an
exception because the item really is already in there.

The follow code illustrates the bug in the class:

static void Main(string[] args)
{
KeyTest foo = new KeyTest();
KeyedClass item1 = new KeyedClass("1", "99");
KeyedClass item2 = new KeyedClass("1", "99");
KeyedClass item3 = new KeyedClass("1", "100");

foo.Add(item1);

// this one fails
if (!foo.Contains(item2))
Console.WriteLine("Broken Based on Instancing!");

// this one fails
if (!foo.Contains(item3))
Console.WriteLine("Broken Based on Key Lookup!");

// this one works
if (!foo.Contains(item2.Key))
Console.WriteLine("Even Contains by Key Type is Broken");
}

class KeyTest : KeyedCollection<string, KeyedClass>
{
protected override string GetKeyForItem(KeyedClass item)
{
return item.Key;
}
}

class KeyedClass
{
public readonly string Key, Value;
public KeyedClass(string k, string v)
{
Key = k; Value = v;
}
}
 
C

Chris Mullins [MVP]

I should add that there's a straightforward workaround. Changing the
concrete KeyedCollection class to override the base Contains method works
pretty well. Unfortunatly, very few people are going to think to do this
every time they create a keyed collection class...

class KeyTest : KeyedCollection<string, KeyedClass>
{
protected override string GetKeyForItem(KeyedClass item)
{
return item.Key;
}

public new bool Contains(KeyedClass item)
{
return this.Contains(item.Key);
}
}
 
J

Jon Skeet [C# MVP]

Chris Mullins said:
KeyedCollection is a very handy little class, that unforutnatly has a nasty
bug in it.

Hmm... I'm not sure.
The bug (which I ran across) causes the following code to fail:

if (!rooms.Contains(room))
_rooms.Add(room);

The problem is that contains returns "false", but then Add throws an
exception because the item really is already in there.

Indeed.

The Contains method is meant to check whether the given *key* is in the
collection. The docs say that GetKeyForItem is used to look up the key
for each element, but there's nothing that state it's used to get the
key for the argument you pass into Contains - that's meant to already
be a key. In your case, it isn't a key, it's an item.

On the other hand, I'm surprised this compiles, given that the type you
pass to Contains is meant to be of type TKey, and in your sample code
you're passing in an instance of TItem.

Hmm. Will look closer when I have more time.
 
M

Mattias Sjögren

Chris,
KeyedCollection is a very handy little class, that unforutnatly has a nasty
bug in it.

I don't think it's a bug. You have to override Equals in the
KeyedClass or provide a custom IEqualityComparer to the
KeyedCollection for Contains to work as expected.

Note that the first two calls to Contains end up calling
Collection<T>.Contains in the base class, which "determines whether an
element is in the Collection" (i.e. requires equality comparison). The
last call on the other hand is to KeyedCollection<TKey,
TItem>.Contains which "determines whether the collection contains an
element with the specified key".


Mattias
 
C

Chris Mullins [MVP]

On the other hand, I'm surprised this compiles, given that the type you
pass to Contains is meant to be of type TKey, and in your sample code
you're passing in an instance of TItem.

That's acutally the underlying problem - The Contains<TItem> is a public
method provided by the base Collection<T> class. The Contains<TKey> method
is provided by the KeyedCollection class.

This means there are two overloads in my sample:
Contains(string key)
Contains(KeyedClass item)

If I call the Contains(string) method, everything is fine - the keyed
collection does it's thing and it works properly. If I called the
Contains(KeyedClass) method, then the Collection<T> class is called
(bypassing all the checked in the KeyedCollection class) and ends up doing a
List.Contains, which is the wrong thing altogether.
 
C

Chris Mullins [MVP]

I'm clear on the behavior as to what's really going on, but once I figured
out it works this way, I found buggy code all over the place.

In many spots, code written by pretty sharp folks was calling
if (items.Containts(Item)) { do something }

.... and it wasn't doing what they thought it would.

Based on "What should this do?", code that looks like (and common variations
on):
if (!rooms.Contains(room))
_rooms.Add(room);

.... really needs to work "right", and not throw an exception.

I realize overriding equals would do the trick here, but for a class called
"KeyedCollection", only the Key should need the equals operator.
 
J

Jon Skeet [C# MVP]

Chris Mullins said:
That's acutally the underlying problem - The Contains<TItem> is a public
method provided by the base Collection<T> class. The Contains<TKey> method
is provided by the KeyedCollection class.

Right. Fair enough.
This means there are two overloads in my sample:
Contains(string key)
Contains(KeyedClass item)

If I call the Contains(string) method, everything is fine - the keyed
collection does it's thing and it works properly. If I called the
Contains(KeyedClass) method, then the Collection<T> class is called
(bypassing all the checked in the KeyedCollection class) and ends up doing a
List.Contains, which is the wrong thing altogether.

And once more, ill-thought-out inheritance strikes :(
 
M

MagicBox

Jon Skeet said:
Right. Fair enough.


And once more, ill-thought-out inheritance strikes :(

Why would that be? Depending on the context of your quest for existance it's
good that both exist.
If you have two collections between which you move items based on user input
or whatever you want to do, you may want to check for existence of the item
reference rather than by key. Although checking for existence by key would
probably faster since the internal dictionairy then would be used. Though if
there are not enough items in the collection to have enabled the internal
dictionary, searching by reference would be faster.
 
J

Jon Skeet [C# MVP]

Why would that be? Depending on the context of your quest for existance it's
good that both exist.

Possibly - but the fact that they're both called the same thing makes
it confusing. Heck, if it can confuse Chris M, it can confuse anyone.

Chris's complaint that:

if (!rooms.Contains(room))
rooms.Add(room);

can break is a perfectly valid one, IMO. The type isn't clear whether
it views the collection as one of keys, or one of values, effectively.
 
C

Chris Mullins [MVP]

Why would that be? Depending on the context of your quest for existance
it's good that both exist.

The problem is that you have two overloads:
..Contains<T>(T)
..Contains(TKey)(TKey)

These two overloads have completly different behaviors. This runs contrary
to all expected behavior of overloaded methods.

In this case, it's simply an oversite by the Framework team. They forgot to
either:

1) Depricate the inherited Contains<T> method that coming from List<T>
2) Create their own Contains<T> method that calls Contains<TKey>
 
Top