HashTable stores whole object as Key NOT the result of calling GetHashCode on the object when you do

J

John Davis

Hi All

I was testing my understanding of Hashtables and I thought that if I
override GetHashCode then i could add an object to hashTable like so:

Dim ht As New Hashtable
ht.Add(s1, s1)
ht.Add(s2, s2)
ht.Add(s3, s3)

Where s* are instance of a student class that overrides GetHashCode to
return the StudentID (code at end of post).

I shouldn't add it like:

ht.Add(s1.GetHashCode(), s1)

as the HashTable calls GetHashCode on the Key for me.

This should then store the Hash (The ID returned from GHC) as the Key and
the student as the object. However if you loop through the HashTable with:

Dim de As DictionaryEntry
For Each de In ht
Console.WriteLine("Key = {0}, Value = {1}", de.Key, de.Value)
Next

It shows that the student is stored as both the Key and the Value. Does this
mean that you should just add objects to the HashTable like:

Dim ht As New Hashtable
ht.Add(s1, Nothing)
ht.Add(s2, Nothing)
ht.Add(s3, Nothing)

Seems to me I have misunderstood something here and thw whole point of a
HashTable is to use a key to get to a value, was hoping somebody could clear
this up for me.

Cheers

John

Student Code

Class Student
Private _Name As String
Private _ID As Integer
Private _Age As Integer

Public Sub New(ByVal Name As String, ByVal ID As Integer, ByVal age As
Integer)
_Name = Name
_ID = ID
_Age = age
End Sub

Public Overrides Function ToString() As String
Return String.Format("{0} is {1} years old and is student no: {2}",
_Name, _Age, _ID)
End Function

Public Overrides Function GetHashCode() As Integer
Return _ID
End Function

End Class
 
D

Damien

John said:
Hi All

I was testing my understanding of Hashtables and I thought that if I
override GetHashCode then i could add an object to hashTable like so:

Dim ht As New Hashtable
ht.Add(s1, s1)
ht.Add(s2, s2)
ht.Add(s3, s3)

Where s* are instance of a student class that overrides GetHashCode to
return the StudentID (code at end of post).

I shouldn't add it like:

ht.Add(s1.GetHashCode(), s1)

as the HashTable calls GetHashCode on the Key for me.

This should then store the Hash (The ID returned from GHC) as the Key and
the student as the object. However if you loop through the HashTable with:

Dim de As DictionaryEntry
For Each de In ht
Console.WriteLine("Key = {0}, Value = {1}", de.Key, de.Value)
Next

It shows that the student is stored as both the Key and the Value. Does this
mean that you should just add objects to the HashTable like:

Dim ht As New Hashtable
ht.Add(s1, Nothing)
ht.Add(s2, Nothing)
ht.Add(s3, Nothing)

Seems to me I have misunderstood something here and thw whole point of a
HashTable is to use a key to get to a value, was hoping somebody could clear
this up for me.
The purpose of a hashtable is to provide a fast lookup based on the
key. It achieves this lookup using the hash of the key value. But
remember, two keys may have the same hash value, so it has to store the
entire key to be sure it's located the correct item.

The best way to understand how it works is to use Reflector. Hashtable
is implemented in mscorlib, and the interesting things to look at are
Insert() and InitHash/GetHash. (Also look at ContainsKey() to
understand how the lookups work)

In your case, you're own objects can serve as their key, but in many
cases they will not - for instance, a hashtable containing a large
object relating to pending requests for a client may be keyed on the
ClientID. In your case, you could have chosen not to override
GetHashCode and instead called ht.Add(s1.StudentID,s1). This will allow
you to find an item using their StudentID, whereas in your example, you
need to have an Student object before you can search for... a Student
object. About the only thing you can do with your sample is, I believe,
use your hash table to test for membership (is the student in the hash
table), whereas with my example, I can determine membership and locate
the larger object.

Hope this helps,

Damien
 
N

Nick Hounsome

Typically you would use Add(s1.StudentId,s1) as this allows you to look up a
student given ONLY the student ID [without doing something stupid like
creating a student from the ID for the sole purpose of lookup]

Whilst it is a good idea to override GetHashCode to return student ID, the
hash code is not generally required to be unique so writing
Add(s1.GetHashCode(),s1) would still be bad as it would confuse people.

Once you have your student you could then have a different hashtable to look
up some other stuff to do with the student such as Add(s1,degreeName). In
this case the hash table would indeed keep a REFERENCE to the student and
call the GetHashCode method. This is often useful as it allows you to find
all the students in the relationship in one step rather than mapping from ID
to degreeName which would require you to get all the IDs and then lookup
them up in the "ID to student" table.

To keep the code nice and deal with the asymetric access you will want 2
methods (or equivalent):

void Add(Student s) { ht.Add(s.StudentId,s); }
Student GetStudent(int id) { return ht[id]; }
 
J

John Davis

Hi Guys

Thanks for your input, I had a look using Reflector and it is a lot clearer
now.

Regarding your suggestion to add a student using

ht.Add(s1.StudentID, s1)

This is obviously a good approach if there is an obvious property or public
field to use as the hash but I just wondered if a class designer has
overriden GetHashCode then this would mean that the user of said object
wouldn't have to know which field property to use and could just call.

ht.Add(s1.GetHashCode(), s1)

Having said this the user of the object would need to know what GetHashCode
actually returned in order to do a lookup so they might as well just add it
using that property/field in the first place. I guess like you say Nick,
GetHashCode is only useful if your are storing the student as the key in
order to store a different object (such as there exams results/degree name)
in the value.

Cheers again for your help on this matter.

John


Nick Hounsome said:
Typically you would use Add(s1.StudentId,s1) as this allows you to look up
a student given ONLY the student ID [without doing something stupid like
creating a student from the ID for the sole purpose of lookup]

Whilst it is a good idea to override GetHashCode to return student ID, the
hash code is not generally required to be unique so writing
Add(s1.GetHashCode(),s1) would still be bad as it would confuse people.

Once you have your student you could then have a different hashtable to
look up some other stuff to do with the student such as
Add(s1,degreeName). In this case the hash table would indeed keep a
REFERENCE to the student and call the GetHashCode method. This is often
useful as it allows you to find all the students in the relationship in
one step rather than mapping from ID to degreeName which would require you
to get all the IDs and then lookup them up in the "ID to student" table.

To keep the code nice and deal with the asymetric access you will want 2
methods (or equivalent):

void Add(Student s) { ht.Add(s.StudentId,s); }
Student GetStudent(int id) { return ht[id]; }


Damien said:
The purpose of a hashtable is to provide a fast lookup based on the
key. It achieves this lookup using the hash of the key value. But
remember, two keys may have the same hash value, so it has to store the
entire key to be sure it's located the correct item.

The best way to understand how it works is to use Reflector. Hashtable
is implemented in mscorlib, and the interesting things to look at are
Insert() and InitHash/GetHash. (Also look at ContainsKey() to
understand how the lookups work)

In your case, you're own objects can serve as their key, but in many
cases they will not - for instance, a hashtable containing a large
object relating to pending requests for a client may be keyed on the
ClientID. In your case, you could have chosen not to override
GetHashCode and instead called ht.Add(s1.StudentID,s1). This will allow
you to find an item using their StudentID, whereas in your example, you
need to have an Student object before you can search for... a Student
object. About the only thing you can do with your sample is, I believe,
use your hash table to test for membership (is the student in the hash
table), whereas with my example, I can determine membership and locate
the larger object.

Hope this helps,

Damien
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top