I'd probably just leave it at that. Calling GetHashCode on age wouldn't
have done anything, as Int32's implementation of GetHashCode simply
returns the original value.
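
For anyone who wants to check this for themselves, here is a quick sketch. The class and method names are made up for illustration; the only framework call is Int32.GetHashCode itself.

```csharp
using System;

// Hypothetical demo class; only Int32.GetHashCode comes from the framework.
public static class Int32HashDemo
{
    // True when Int32.GetHashCode hands the value back unchanged.
    public static bool HashEqualsValue(int value) => value.GetHashCode() == value;

    public static void Main()
    {
        // Sample ages plus a couple of edge cases.
        foreach (int age in new[] { 0, 42, -7, int.MaxValue })
        {
            Console.WriteLine($"{age} -> {age.GetHashCode()}");
        }
    }
}
```

Each line should print the same number on both sides of the arrow.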
Yeah, after realizing Int32.GetHashCode just returns the same value
I'd leave it as-is as well. It's too much effort trying to think of a
better distribution, especially considering that people with the same
name (first and last, anyway) don't turn up frequently enough to matter.
Ya know, my naive line of thinking was that Int32.GetHashCode would
return something other than the original value so that the
distribution would be completely random. As it is now the
distribution can definitely be considered good since the hash code has
the same entropy as the original value, but it's not at all random.
To be perfectly honest, I'm not sure how important it is for the hash
code to be absolutely random as opposed to just good.
I'd hope that a good hashtable implementation would effectively ignore
the "closeness" of hashcodes to some extent, for instance by bucketing
with some appropriate mod function.
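
To make the "mod function" idea concrete, here is a minimal sketch of bucketing by remainder. It is only an illustration of the general technique, not the code from Hashtable or Dictionary; the names and the table size of 7 are assumptions.

```csharp
using System;

// Hypothetical demo; a generic mod-based bucketing scheme.
public static class BucketDemo
{
    // Clear the sign bit so the result is non-negative, then take the remainder.
    public static int BucketFor(int hashCode, int bucketCount) =>
        (hashCode & 0x7FFFFFFF) % bucketCount;

    public static void Main()
    {
        const int bucketCount = 7; // assumed small prime table size
        // Seven "close" hash codes, e.g. consecutive ages.
        for (int age = 30; age < 37; age++)
        {
            Console.WriteLine($"hash {age} -> bucket {BucketFor(age, bucketCount)}");
        }
    }
}
```

With these numbers the seven consecutive hash codes land in buckets 2, 3, 4, 5, 6, 0 and 1, so closeness of the codes does not by itself cause collisions.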
The documentation does mention that for best performance the
distribution of GetHashCode should be random. But, I'm sure that
Hashtable and Dictionary both have good implementations and that the
clustering around the name in your example is probably negligible. I
brought it up because I thought it would stimulate some interesting
discussions at the very least.
Several years ago I looked at the shared source code for Hashtable and
I believe it was using the quadratic probing method. It's been a while
since my data structures class so I'm a little rusty, but I believe
that method does offer some protection from the clustering problem.
Anyone know what method Dictionary uses? I thought it was different
from Hashtable.
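
For reference, this is roughly what textbook quadratic probing looks like. It is a generic sketch of the technique as taught in data structures courses, not something taken from the Hashtable source; the names and the table size of 11 are assumptions.

```csharp
using System;

// Hypothetical demo of textbook quadratic probing.
public static class QuadraticProbeDemo
{
    // Slot tried on the given attempt: (home + attempt^2) mod tableSize.
    public static int Probe(int hashCode, int attempt, int tableSize)
    {
        int home = (hashCode & 0x7FFFFFFF) % tableSize;
        // Use long arithmetic so attempt * attempt cannot overflow.
        return (int)((home + (long)attempt * attempt) % tableSize);
    }

    public static void Main()
    {
        const int tableSize = 11; // assumed small prime table size
        foreach (int hash in new[] { 3, 4 })
        {
            Console.Write($"hash {hash}:");
            for (int attempt = 0; attempt < 4; attempt++)
            {
                Console.Write($" {Probe(hash, attempt, tableSize)}");
            }
            Console.WriteLine();
        }
    }
}
```

Hash 3 probes slots 3, 4, 7, 1 and hash 4 probes 4, 5, 8, 2. With linear probing both keys would walk the same run of adjacent slots; here the sequences spread out after the first step, which is the protection against clustering.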
Agreed. Ironically I was thinking about blogging precisely this point,
although for slightly different reasons. Many types - most types, even
- aren't really suitable as hash keys unless you're actually talking
about identity. That can still be useful, but it's probably worth
knowing what you're talking about.
So long as there was an implementation of IEqualityComparer available
which used the "native" implementation of the current
object.GetHashCode to cope with the identity part, I think it would be
entirely reasonable not to have GetHashCode and Equals as part of
object itself. I think of it as a mistake that Java made and .NET then
copied.
(There may have been platforms before Java which had a single type
hierarchy where the uber-type had hashcode and equality defined - I
just don't know of any.)
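
A rough sketch of the kind of identity comparer described above might look like this. IdentityComparer is a made-up name; RuntimeHelpers.GetHashCode and ReferenceEquals are the real framework calls that ignore any GetHashCode or Equals overrides.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two references are equal only if they are the same object.
public sealed class IdentityComparer<T> : IEqualityComparer<T> where T : class
{
    // Reference identity, regardless of any Equals override on T.
    public bool Equals(T x, T y) => ReferenceEquals(x, y);

    // The "native" object hash, regardless of any GetHashCode override on T.
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

public static class IdentityComparerDemo
{
    public static void Main()
    {
        // Two distinct string objects with the same contents.
        string first = new string('a', 3);
        string second = new string('a', 3);

        var byIdentity = new Dictionary<string, int>(new IdentityComparer<string>());
        byIdentity[first] = 1;
        byIdentity[second] = 2;

        // The default comparer would see one key; the identity comparer sees two.
        Console.WriteLine(byIdentity.Count);
    }
}
```

Passing the comparer to Dictionary keeps the identity behaviour available to whoever wants it without the key type needing to define hashing or equality at all.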
I think that would be an interesting topic for a blog. I do try to
read your blog posts so I'll catch it if you do decide to post on the
topic.