Comment on how to uniquely track your objects in C# / hash table /get hash code

raylopez99

That's what Jon meant when he said that it depends on what you mean by
"equality", and there are numerous pitfalls here. For instance, there
are two versions of Equals: one is static (in the Object class), the
other is an instance method of your own objects. The static one cannot
be overridden; object.Equals(o1, o2) first checks whether both
references point to the same object (or are both null) and otherwise
calls o1.Equals(o2), so it matches ReferenceEquals for reference types
only as long as the instance method is not overridden. The instance
method is occasionally overridden (often together with the == operator)
to accommodate the case you're stating (comparing surfaces of two
circles).
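
To make the different flavours of Equals concrete, here is a minimal sketch. The class CircleWithEquals is hypothetical (it is not from anyone's post) and treats two circles as equal when their radii match; the comments show what each comparison reports for two separate instances.

```csharp
using System;

// Hypothetical class for illustration: two instances are "equal"
// when their radii match.
public sealed class CircleWithEquals
{
    public double Radius { get; }

    public CircleWithEquals(double radius) { Radius = radius; }

    public override bool Equals(object obj)
    {
        return obj is CircleWithEquals other && Radius == other.Radius;
    }

    // Equal objects must produce equal hash codes.
    public override int GetHashCode()
    {
        return Radius.GetHashCode();
    }
}

public static class EqualsDemo
{
    public static void Main()
    {
        var c1 = new CircleWithEquals(2.0);
        var c2 = new CircleWithEquals(2.0);

        Console.WriteLine(c1.Equals(c2));                  // True: overridden instance Equals
        Console.WriteLine(object.Equals(c1, c2));          // True: static Equals ends up calling c1.Equals(c2)
        Console.WriteLine(object.ReferenceEquals(c1, c2)); // False: two distinct objects
        Console.WriteLine((object)c1 == (object)c2);       // False: == on object compares references
    }
}
```

Only ReferenceEquals (and == when it has not been overloaded) is guaranteed to answer the "same place in memory" question.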

Another pitfall: ReferenceEquals will always return false for value
types, e.g.:

int a = 5;
int b = a;
Console.WriteLine(object.ReferenceEquals(a,b));

...will yield "false", because each value-type argument is boxed into
its own separate object when it is passed to ReferenceEquals.
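
A short sketch of the boxing effect, for anyone who wants to try it. The class name BoxingDemo is just a placeholder; the point is that a single box compared with itself is the same reference, while two boxings of the same int are not.

```csharp
using System;

public static class BoxingDemo
{
    public static void Main()
    {
        int a = 5;

        // Each argument is boxed separately, so even the same variable
        // compared with itself ends up as two different boxes.
        Console.WriteLine(object.ReferenceEquals(a, a));         // False

        // Box once and compare that single box with itself.
        object boxed = a;
        Console.WriteLine(object.ReferenceEquals(boxed, boxed)); // True

        // Value comparison is what you normally want for value types.
        Console.WriteLine(a.Equals(5));                          // True
    }
}
```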

The thing is, as you can see, "being equal" can have several meanings.
That's why I usually never override Equals or ==, but provide a method
with a stupid/evocative name like "Circle.AreSurfacesEqual(o1, o2)".
It has the advantage of removing the ambiguity you're talking about.
That's just me, though.
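
A rough sketch of what that might look like. Only the name Circle.AreSurfacesEqual comes from the paragraph above; the Radius and Surface members and the tolerance parameter are assumptions made for the example.

```csharp
using System;

// Hypothetical Circle that leaves Equals and == alone (reference semantics)
// and exposes the domain-specific comparison under an explicit name.
public sealed class Circle
{
    public double Radius { get; }

    public Circle(double radius) { Radius = radius; }

    public double Surface
    {
        get { return Math.PI * Radius * Radius; }
    }

    // The tolerance parameter is an assumption of this sketch; floating-point
    // areas are rarely bit-for-bit identical.
    public static bool AreSurfacesEqual(Circle c1, Circle c2, double tolerance = 1e-9)
    {
        if (c1 == null || c2 == null) return false;
        return Math.Abs(c1.Surface - c2.Surface) <= tolerance;
    }
}

public static class SurfaceDemo
{
    public static void Main()
    {
        var c1 = new Circle(1.5);
        var c2 = new Circle(1.5);

        Console.WriteLine(c1.Equals(c2));                   // False: default reference equality
        Console.WriteLine(Circle.AreSurfacesEqual(c1, c2)); // True: same area
    }
}
```

Because Equals is untouched, c1.Equals(c2) keeps meaning "same object", and the surface comparison is visible at the call site.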

Yes, good stuff here, I've added this quote of yours to my library of
C# stuff that I keyword search.
Yes, but as you can still discriminate the rows through the table name/
number, there won't be any collisions, in the same way you know that a
dog called "Snoop" is different from a cat named "Snoop", even though
they have the same name, the same number of legs, eyes... It's handled
in exactly the same way as the filesystem, if you have two files with
identical names in an identical directory structure, like this :

C:\Docs\SomeFolder\File1.txt
D:\Docs\SomeFolder\File1.txt

...you still know they're different files, because the full path
differs (even if only by the drive letter). If the *full* path were the
same, then it would be the same file. It's the same principle for DBs
and for memory management of variables.

Yes, sorry, my example was bad since it was not a true collision. But
my point is that unless you use GUIDs or Autoincrement or some such
number as your Primary key, it's hard to define uniqueness. However,
more relevant to C#, after going through this thread I feel better
that C# does indeed distinguish between data (variables, objects,
whatever) stored in different parts of memory for the default .Equals
method and for ReferenceEquals (which I didn't even know existed as a
method) and doesn't rely on an internal number and/or hash number. And
I agree that overloading the "==" or "+" operators or overriding
Equals, as was the fashion in C++, is just eye candy / syntactic sugar
and not a good idea, so I like your suggestion of
Circle.AreSurfacesEqual.
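
To see that identity does not hinge on hash codes, here is a small sketch. BadHashKey is a made-up class whose GetHashCode deliberately returns the same number for every instance; the two objects still compare as different, and a Dictionary still keeps them apart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type whose hash code is deliberately terrible:
// every instance reports the same hash.
public sealed class BadHashKey
{
    public string Name { get; }

    public BadHashKey(string name) { Name = name; }

    public override int GetHashCode() { return 42; }

    public override bool Equals(object obj)
    {
        return obj is BadHashKey other && Name == other.Name;
    }
}

public static class CollisionDemo
{
    public static void Main()
    {
        var dog = new BadHashKey("Snoop the dog");
        var cat = new BadHashKey("Snoop the cat");

        Console.WriteLine(dog.GetHashCode() == cat.GetHashCode()); // True: hash codes collide
        Console.WriteLine(dog.Equals(cat));                        // False
        Console.WriteLine(object.ReferenceEquals(dog, cat));       // False

        // The dictionary uses the hash only to pick a bucket, then calls Equals,
        // so both keys are stored despite the collision.
        var pets = new Dictionary<BadHashKey, string>();
        pets[dog] = "woof";
        pets[cat] = "meow";
        Console.WriteLine(pets.Count); // 2
    }
}
```

A colliding hash only makes lookups slower; it never makes two different keys count as the same one.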

RL
 
Arne Vajhøj

raylopez99 said:
I'm hardly an authority on dB's, having spent about a month learning
them (but did get a functional program using Visual Basic and Access
Jet as the dB), but I believe auto number / auto increment / identity
will be the same thing as GUID, except shorter, so you're back to
square one, in the sense that you could have collision between two
tables that use the same auto number and table name, but contain
distinct rows.

No.

Auto increment etc. just start with 1 and add 1 for each row
inserted. And when the data type cannot hold a bigger value you get an
exception when inserting (some DB's at least), and if you use
a sufficiently large data type you will run out of disk
space before that. There will never be a collision.

GUIDs that are UUID version 4 have 122 random bits, so any given pair
has a collision probability of 1/2^122. Which is very small, but still
greater than zero.

They are different.
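
The difference can be sketched in a few lines of C#. AutoIncrement is a hypothetical in-memory stand-in for a database counter column (a real database does this server-side); Guid.NewGuid() is the standard .NET call for generating a GUID.

```csharp
using System;

// Hypothetical in-memory stand-in for a database auto-increment column.
public sealed class AutoIncrement
{
    private long _last;

    public AutoIncrement(long start = 0) { _last = start; }

    // checked makes the increment throw OverflowException instead of wrapping
    // around, loosely mirroring a database that rejects the insert.
    public long Next()
    {
        return checked(++_last);
    }
}

public static class KeyDemo
{
    public static void Main()
    {
        var ids = new AutoIncrement();
        Console.WriteLine(ids.Next()); // 1
        Console.WriteLine(ids.Next()); // 2

        // GUIDs are drawn at random rather than counted, so uniqueness is
        // overwhelmingly likely but not guaranteed by construction.
        Guid g1 = Guid.NewGuid();
        Guid g2 = Guid.NewGuid();
        Console.WriteLine(g1 == g2);
    }
}
```

The counter can run out, but it cannot hand out the same value twice; the GUID can in principle repeat, but never runs out in practice.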

Arne
 
Arne Vajhøj

Pavel said:
Note that Object.ReferenceEquals() is precisely equivalent to just
using operator== for reference types which do not overload this
operator - which is 99% of them all.

Maybe even higher for classes in general.

But for classes stored in hash tables ...
I'm no expert on this, so take it with a grain of salt; but I recall
there was a question along these lines in the MCSD->MCPD upgrade exam
- about the number of items at which hashtable becomes more efficient
than a list - and while I do not remember the precise answer, it was
of that order of magnitude (i.e., 10^2).

I guess it would depend on the cost of the hash function (relative
to comparison).
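
One way to look at that trade-off is to count the work rather than time it. The sketch below is only an illustration: CountingComparer and LookupDemo are made-up names, and the comparer simply tallies how often it is asked to hash or compare while a list and a Dictionary each look up the last of 100 keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer that counts how often it is asked to hash or compare.
public sealed class CountingComparer : IEqualityComparer<int>
{
    public int EqualsCalls;
    public int HashCalls;

    public bool Equals(int x, int y) { EqualsCalls++; return x == y; }
    public int GetHashCode(int value) { HashCalls++; return value; }
}

public static class LookupDemo
{
    // Linear search: compares the target with each element until a match.
    public static bool ListContains(List<int> items, int target, CountingComparer comparer)
    {
        foreach (int item in items)
        {
            if (comparer.Equals(item, target)) return true;
        }
        return false;
    }

    public static void Main()
    {
        const int n = 100;
        var comparer = new CountingComparer();
        var list = new List<int>();
        var table = new Dictionary<int, string>(comparer);
        for (int i = 0; i < n; i++) { list.Add(i); table[i] = "row " + i; }

        // Worst case for the list: the target is the last element.
        comparer.EqualsCalls = 0; comparer.HashCalls = 0;
        ListContains(list, n - 1, comparer);
        Console.WriteLine("list: {0} comparisons", comparer.EqualsCalls); // list: 100 comparisons

        // The dictionary hashes the key, then compares only within one bucket.
        comparer.EqualsCalls = 0; comparer.HashCalls = 0;
        table.ContainsKey(n - 1);
        Console.WriteLine("dictionary: {0} hash calls, {1} comparisons",
            comparer.HashCalls, comparer.EqualsCalls);
    }
}
```

The list pays one cheap comparison per element, while the dictionary pays for a hash computation up front and then very few comparisons, so the break-even point moves with how expensive GetHashCode is relative to Equals.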

Arne
 
raylopez99

Auto increment etc. just start with 1 and add 1 for each row
inserted. And when the data type cannot hold a bigger value you get an
exception when inserting (some DB's at least), and if you use
a sufficiently large data type you will run out of disk
space before that. There will never be a collision.

Right, but does anybody use "auto increment" for a primary key? I
guess they do, same as a GUID but you have to add a unique table name
I guess. But, at the time I wrote this, I was thinking of so-called
"natural keys" and using them as your primary key, and the difficulty
of uniqueness using natural keys. And, at the time I wrote this, I
did not realize that "==" or "ReferenceEquals" is a mechanical
calculation that is 100% guaranteed not to produce collisions--I
thought that it involved GetHashCode/ Hash codes, so there was 1/2^122
or whatever chance of a collision.
GUIDs that are UUID version 4 have 122 random bits, so any given pair
has a collision probability of 1/2^122. Which is very small, but still
greater than zero.

Thanks for clearing that up.

RL
 
Arne Vajhøj

raylopez99 said:
Right, but does anybody use "auto increment" for a primary key?

Oh yes.

Try inviting 10 database people to dinner and ask them whether
they prefer natural keys or surrogate keys. Then you will see
a fight break out, so be sure to remove any guns and knives from
them before the fun starts.

Arne
 
Pavel Minaev

Try inviting 10 database people to dinner and ask them whether
they prefer natural keys or surrogate keys. Then you will see
a fight break out, so be sure to remove any guns and knives from
them before the fun starts.

Then ask whether GUIDs do indeed make good primary keys, but be ready
to duck, since all attention will be immediately diverted to you.
 
