Why is the concept of Equals and operator== implemented this way?

cody

Why can I overload operator== and operator!= separately, with different
implementations, and additionally override Equals() with yet another
implementation?

Why not forbid overloading of == and != and instead automatically translate
each use of objA==objB into System.Object.Equals(objA, objB)?

This would remove inconsistencies like

myString1==myString2

and

(object)myString1==(object)myString2

having different results.

Also, shouldn't operator!= always return the negated value of operator==?
So what is overloading it separately good for?

I would like to understand the technical reason for that, if there was any.
 
Barry Kelly

cody said:
Why can I overload operator== and operator!= separately, with different
implementations, and additionally override Equals() with yet another
implementation?

Why not forbid overloading of == and != and instead automatically translate
each use of objA==objB into System.Object.Equals(objA, objB)?

'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.
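
To make the static-versus-dynamic distinction concrete, here is a minimal sketch (the class and variable names are made up for illustration). The compiler picks the '==' overload from the compile-time types of the operands, whereas Equals is dispatched on the runtime type of the object it is called on:

```csharp
using System;

class StaticVsDynamicDemo
{
    static void Main()
    {
        string a = "hello";
        // Build a second string with the same characters but a distinct
        // identity, so that literal interning does not hide the difference.
        string b = new string("hello".ToCharArray());

        // Compile-time types are string: string's overloaded == is chosen.
        Console.WriteLine(a == b);                 // True

        // Compile-time types are object: the built-in reference comparison
        // is chosen, no matter what the runtime types are.
        Console.WriteLine((object)a == (object)b); // False

        // Equals is virtual: string.Equals runs even through an object reference.
        object oa = a;
        Console.WriteLine(oa.Equals(b));           // True
        Console.WriteLine(object.Equals(oa, b));   // True
    }
}
```

The second comparison is the one that compiles down to a plain reference check with no method call at all.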
This would remove inconsistencies like

myString1==myString2

and

(object)myString1==(object)myString2

having different results.

It would also make the '==' operator always perform a dynamic dispatch
(i.e. a virtual method call). Not only would it be slower, but it would
also make it harder to (for example) program safely with multithreaded
locks.

When you call a virtual method, you've got no idea what code is going to
be called. That's one of the reasons why it isn't recommended to call
virtual methods while holding a lock. If things were to work this way,
no one could safely compare object references while holding locks,
unless they used the cumbersome object.ReferenceEquals(object,object)
method.

Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?
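
As a rough illustration of the mixed-type case, the sketch below uses a hypothetical MyComplex struct (not a Framework type). Because operators are static, the type can supply overloads for both operand orders, while double.Equals has no way of knowing about MyComplex:

```csharp
using System;

// Hypothetical complex number type, for illustration only.
struct MyComplex
{
    public readonly double Re, Im;
    public MyComplex(double re, double im) { Re = re; Im = im; }

    public static bool operator ==(MyComplex a, MyComplex b) { return a.Re == b.Re && a.Im == b.Im; }
    public static bool operator !=(MyComplex a, MyComplex b) { return !(a == b); }

    // Mixed-type overloads, one for each operand order.
    public static bool operator ==(MyComplex a, double b) { return a.Im == 0.0 && a.Re == b; }
    public static bool operator !=(MyComplex a, double b) { return !(a == b); }
    public static bool operator ==(double a, MyComplex b) { return b == a; }
    public static bool operator !=(double a, MyComplex b) { return !(b == a); }

    public override bool Equals(object obj) { return obj is MyComplex && this == (MyComplex)obj; }
    public override int GetHashCode() { return Re.GetHashCode() ^ Im.GetHashCode(); }
}

class MixedTypeDemo
{
    static void Main()
    {
        MyComplex z = new MyComplex(1.5, 0.0);
        double d = 1.5;

        Console.WriteLine(z == d);      // True: operator ==(MyComplex, double)
        Console.WriteLine(d == z);      // True: operator ==(double, MyComplex)

        // double.Equals(object) only recognizes other doubles, so a
        // translation of d == z into d.Equals(z) would give a different answer.
        Console.WriteLine(d.Equals(z)); // False
    }
}
```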
Also, shouldn't operator!= always return the negated value of operator==?
So what is overloading it separately good for?

This question I cannot answer, since I've never had a reason to have a
different implementation.

-- Barry
 
cody

Barry said:
'==' is resolved statically, while Object.Equals(object) is determined
by dynamic dispatch. The simplest answer is: performance, both in
textual code size (see below) and in runtime execution.

Even in unrealistic micro-benchmarks with empty method bodies, there is
almost no difference between normal methods and virtual ones.
When you call a virtual method, you've got no idea what code is going to
be called. That's one of the reasons why it isn't recommended to call
virtual methods while holding a lock. If things were to work this way,
no one could safely compare object references while holding locks,
unless they used the cumbersome object.ReferenceEquals(object,object)
method.

This sounds logical at first, but looking into Microsoft's Shared Source
CLI implementation, almost all implementations of operator== call
Equals(), and the very few that don't either access properties of the
object (which in some cases are also virtual) or call virtual methods
internally.
Also, it is not always possible to override the correct Equals. If the
two sides of the '==' are of different type, which side would the
compiler generate a call to .Equals for? What if one side is a Framework
type like int, and the other is a complex number type?

Well, I see the problem. But in reality I've never seen an
implementation of operator== for different types. Have you?
I cannot imagine a scenario where this would be necessary, given that
objects of different types should by definition never be considered
equal, and even if they should, you can implement an implicit
conversion operator for that.
This question I cannot answer, since I've never had a reason to have a
different implementation.

Very strange indeed. Even in comparisons involving NaN or nullable types,
operator!= always yields the opposite of the corresponding operator==:

Console.WriteLine(double.NaN == double.NaN); // false
Console.WriteLine(double.NaN != double.NaN); // true
Console.WriteLine(float.NaN == float.NaN); // false
Console.WriteLine(float.NaN != float.NaN); // true
Console.WriteLine((float?)1 == (float?)null); // false
Console.WriteLine((float?)1 != (float?)null); // true

In conclusion, I still feel that the concepts of ==, != and Equals could
have been implemented in a simpler and more logical way in .NET.
The current implementation may be 1% faster than the simpler one, and it
lets you do very strange things like having == and != return the same
value or making objects of different types equal, but if that makes 99.9%
of all normal cases harder to write and maintain, the price is too high.
 
Barry Kelly

cody said:
Even in unrealistic micro-benchmarks with empty method bodies, there is
almost no difference between normal methods and virtual ones.

Unless it's been overloaded, '==' isn't a method call.
This sounds logical at first, but looking into Microsoft's Shared Source
CLI implementation, almost all implementations of operator== call
Equals(), and the very few that don't either access properties of the
object (which in some cases are also virtual) or call virtual methods
internally.

What you seem to be advocating is to turn *every* *usage* of '==' with
reference types into a method call, but it isn't currently. What you've
been looking up in the SSCLI is the overloaded '==' operator on various
types. The built-in '==' operator for reference types has different
semantics.
Well, I see the problem. But in reality I've never seen an
implementation of operator== for different types. Have you?

Yes. I've implemented them, for my own Date type.
I cannot imagine a scenario where this would be necessary, given that
objects of different types should by definition never be considered
equal, and even if they should, you can implement an implicit
conversion operator for that.

There was a conversion operator too, but why convert when you can
compare as-is?
In conclusion, I still feel that the concepts of ==, != and Equals could
have been implemented in a simpler and more logical way in .NET.
The current implementation may be 1% faster than the simpler one, and it
lets you do very strange things like having == and != return the same
value or making objects of different types equal, but if that makes 99.9%
of all normal cases harder to write and maintain, the price is too high.

I will submit this contrived micro-benchmark in favour of the current
situation, if only to point out some of the overhead of virtual method
calls on Equals etc.:

---8<---
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Text;
using System.Runtime.CompilerServices;

class SomeObject
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public override bool Equals(object obj)
    {
        // This would be the only way to perform reference equality
        // checks if the proposed idea was implemented.
        return object.ReferenceEquals(this, obj);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool operator==(SomeObject left, SomeObject right)
    {
        return object.ReferenceEquals(left, right);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool operator!=(SomeObject left, SomeObject right)
    {
        return !object.ReferenceEquals(left, right);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

class App
{
    delegate void Method();

    static void Benchmark(int iterations, string label, Method method)
    {
        method(); // warmup

        Stopwatch start = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
            method();
        Console.WriteLine("{0,20} : {1,6:f3} ({2} iterations)",
            label,
            start.ElapsedTicks / (double) Stopwatch.Frequency,
            iterations);
    }

    static void Main()
    {
        const int iterCount = 30;
        const int objectCount = 10000000;

        // To eliminate "unused value" optimizations.
        int equalCount = 0;

        SomeObject[] list = new SomeObject[objectCount];
        for (int i = 0; i < objectCount; ++i)
            list[i] = new SomeObject();

        Benchmark(iterCount, "Overloaded '=='", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (list[i] == null)
                    ++equalCount;
        });

        Benchmark(iterCount, "Overridden 'Equals'", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (list[i].Equals(null))
                    ++equalCount;
        });

        Benchmark(iterCount, "Object '=='", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if ((object) list[i] == null)
                    ++equalCount;
        });

        Benchmark(iterCount, "RefEquals", delegate
        {
            for (int i = 0; i < list.Length; ++i)
                if (object.ReferenceEquals(list[i], null))
                    ++equalCount;
        });

        Console.WriteLine("Total EqualCount: {0}", equalCount);
    }
}
--->8---

On my system:

---8<---
Overloaded '==' : 1.582 (30 iterations)
Overridden 'Equals' : 2.662 (30 iterations)
Object '==' : 0.609 (30 iterations)
RefEquals : 0.896 (30 iterations)
Total EqualCount: 0
--->8---

Of limited applicability, very contrived, usual disclaimers, etc. etc...

-- Barry
 
cody

Barry said:
Unless it's been overloaded, '==' isn't a method call.

Yes, that is true. If == couldn't be overloaded, the system would always
have to call Object.Equals(), and this could not be optimized to a single
IL instruction as it is now. One would either have to call
Object.ReferenceEquals() explicitly (which could then be inlined),
or an operator=== as in PHP would have to be introduced.

System.Object.Equals(objA, objB) only calls Equals on the first
parameter, so you would always have to write if (myComplexNumber==1.5).
Having your own or the most specialized type on the left side is very
intuitive, at least for me. Sure, when the comparison is written the
other way around, false is always returned. But when the types are known
at compile time and no appropriate Equals method exists, while the
compiler can determine that one exists for the reversed operands, a
warning could be issued.
Yes. I've implemented them, for my own Date type.

Which type does it compare with? Your own type with System.DateTime?
Now that you mention it, I remember doing exactly the same thing because
we needed a Date class for our project that could be null. It was so long
ago that I had forgotten, sorry :)
But this is an interesting case. We have different types
(System.DateTime and our own), but each represents exactly the same
entity, so users expect them to be comparable with each other.
There was a conversion operator too, but why convert when you can
compare as-is?

Well, I can see that there can indeed be a big performance difference
between always creating a new object just for equality testing and simply
comparing some fields against each other, but in the date case you just
encapsulate a DateTime, so no new object needs to be created.
I will submit this contrived micro-benchmark in favour of the current
situation, if only to point out some of the overhead of virtual method
calls on Equals etc.:


Taking a closer look at my benchmark and yours, I noticed that I was
always calling the virtual methods on the same object in a loop, so the
JIT could cache the method pointer in a register. No wonder virtual
methods were nearly as fast as normal ones :)

Running the release exe outside the IDE (using .NET 2.0), ReferenceEquals
always runs faster than Object== on my computer. But if I remove the
NoInlining attribute, ReferenceEquals is slower than Object==.

overridden : 3,401 (30 iterations)
normal : 2,725 (30 iterations)
static : 1,212 (30 iterations)
Overloaded '==' : 0,716 (30 iterations)
Overridden 'Equals' : 3,428 (30 iterations)
Object '==' : 1,104 (30 iterations)
RefEquals : 0,898 (30 iterations)
Total EqualCount: 0
 
Chris Nahr

Why not forbid overloading of == and != and instead automatically translate
each use of objA==objB into System.Object.Equals(objA, objB)?

1. Upcasting to object involves boxing & unboxing for value types, and
that's a very expensive operation. You absolutely want to avoid that
for something as frequently used as an equality test.

2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.

Now that we have generics, both issues could be avoided with a generic
Equals method; unfortunately that was not available back when the CLR
was designed. So strongly-typed equality tests were necessary back
then, and strongly-typed methods can't be inherited from Object.
Also, shouldn't operator!= always return the negated value of operator==?

Yeah, that's a good point. operator!= could be auto-generated.
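
For what it's worth, both points can be sketched together. The example below assumes a hypothetical Point struct: it implements the generic IEquatable<T> interface so that equality tests avoid boxing and type checks, and it writes operator!= as nothing more than the negation of the equality test, which is the part a compiler could in principle generate:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type, for illustration only.
struct Point : IEquatable<Point>
{
    public readonly int X, Y;
    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed: no boxing and no type check needed.
    public bool Equals(Point other) { return X == other.X && Y == other.Y; }

    // The object-based override still has to check the type.
    public override bool Equals(object obj) { return obj is Point && Equals((Point)obj); }
    public override int GetHashCode() { return X * 397 ^ Y; }

    public static bool operator ==(Point a, Point b) { return a.Equals(b); }
    // Purely mechanical negation of the equality test.
    public static bool operator !=(Point a, Point b) { return !a.Equals(b); }
}

class EquatableDemo
{
    static void Main()
    {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);

        Console.WriteLine(p == q); // True
        Console.WriteLine(p != q); // False

        // Generic code reaches the strongly typed Equals(Point) through
        // EqualityComparer<T>.Default when T implements IEquatable<T>.
        Console.WriteLine(EqualityComparer<Point>.Default.Equals(p, q)); // True
    }
}
```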
 
cody

Chris said:
1. Upcasting to object involves boxing & unboxing for value types, and
that's a very expensive operation. You absolutely want to avoid that
for something as frequently used as an equality test.

Structs do not have a default implementation of operator==. If you want
one, you could implement a suitable Equals method taking the appropriate
value type as its parameter; the compiler would then translate
myStruct1==myStruct2 into a call to myStruct1.Equals(myStruct2), so that
MyStruct.Equals(MyStruct obj) is called.
2. Even with reference types, you're overriding a method that takes
object parameters, so your own Equals override will have to perform
cumbersome type checking on the supplied objects.

Sure it will have to, but honestly, how often do you use == for objects
that are neither value types nor strings (which are sealed)? Both of
those require no type checks. In collection methods like IndexOf the
non-type-safe version of Equals is used, and type checks have to be
performed anyway.
For most other cases you want reference equality anyway, for which an
operator=== could be used.

And as I said, nothing prevents you from implementing a strongly typed
version of Equals, which the compiler would then pick, and no type check
is necessary there either.
Now that we have generics, both issues could be avoided with a generic
Equals method; unfortunately that was not available back when the CLR
was designed. So strongly-typed equality tests were necessary back
then, and strongly-typed methods can't be inherited from Object.

I am not sure how generics can help here.
Yeah, that's a good point. operator!= could be auto-generated.

The same applies to operator< and operator>= vs. operator> and
operator<=, to operator==(Y a, X b) vs. operator==(X b, Y a), and to
operator true vs. operator false.
 
Barry Kelly

cody said:
Sure it will have to, but honestly, how often do you use == for objects
that are neither value types nor strings (which are sealed)? Both of
those require no type checks. In collection methods like IndexOf the
non-type-safe version of Equals is used, and type checks have to be
performed anyway.

Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.
For most other cases you want reference equality anyway, for which an
operator=== could be used.

Here, it seems to me, you are introducing a new operator in order to do
what most people already use '==' for.
And as I said, nothing prevents you from implementing a strongly typed
version of Equals, which the compiler would then pick, and no type check
is necessary there either.

But: this is advocating making an expensive boxing operation the
default. That approach might be more reasonable in a more dynamic
language like Lisp or Python, but it isn't right in a language with a
C-based history.

-- Barry
 
Jon Skeet [C# MVP]

Barry Kelly said:
Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.

Actually, I rarely compare things which *don't* overload equality. I
rarely check for references being identical, but I often check for
strings being equal, for instance.

I often compare value types for equality, however.

Out of interest, what do you tend to use reference identity tests for?
 
Mark Wilden

Barry Kelly said:
Referential equality for mutable types has completely different
semantics from value equality, which is used for strings.

Hmmm...I thought all identical strings were folded to the same reference. Or
am I thinking of a completely different language?

///ark
 
cody

Barry said:
Referential equality for mutable types has completely different
semantics from value equality, which is used for strings. With the
greatest respect, I use referential equality far, far, far more often
than value equality.

Taking something out of thin air:

Foo f = new Foo();
Foo f2 = new Foo();
Console.WriteLine(f == f2);

How often, in your programs, do you require this to print "True" on the
console? Speaking for myself, almost *never*.


Here, it seems to me, you are introducing a new operator in order to do
what most people already use '==' for.

Which IMO is bad style. You never know whether the given objects have an
overloaded operator or not; one may be added later, which would change
the semantics of your code. Currently you have to use
Object.ReferenceEquals or (object)obj1==(object)obj2 to ensure that
reference equality is used.
But: this is advocating making an expensive boxing operation the
default. That approach might be more reasonable in a more dynamic
language like Lisp or Python, but it isn't right in a language with a
C-based history.

Microsoft recommends in the newer guidelines to always add a strongly
typed Equals to classes anyway.

The current implementation of operator overloading doesn't allow generic
code to make use of operators, since static methods such as operators
cannot be specified in interfaces.

public T Add<T>(T a, T b)
    where T : IAddable<T> // proposed: the interface would declare + and -
{
    return a+b; // would internally call a.Add(b)
}

This way, users could write interfaces that declare the operators they
need for specific operations. In languages like C++, D (Digital Mars),
Python and Ruby, operators are also normal methods (C++ also allows static
ones for operators).
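
For comparison, a sketch of how the same idea can be approximated with ordinary instance methods. IAddable<T>, Cents and Calc are hypothetical names used only for illustration; the generic method calls Add directly because the + operator itself is not reachable through the interface constraint:

```csharp
using System;

// Hypothetical interface standing in for an "addable" contract.
interface IAddable<T>
{
    T Add(T other);
}

// Hypothetical value type that offers both the method and the operator.
struct Cents : IAddable<Cents>
{
    public readonly int Value;
    public Cents(int value) { Value = value; }

    public Cents Add(Cents other) { return new Cents(Value + other.Value); }

    // The operator simply forwards to the instance method.
    public static Cents operator +(Cents a, Cents b) { return a.Add(b); }
}

static class Calc
{
    // Generic code can only use what the constraint exposes, i.e. Add.
    public static T Sum<T>(T a, T b) where T : IAddable<T>
    {
        return a.Add(b);
    }
}

class AddableDemo
{
    static void Main()
    {
        Cents total = Calc.Sum(new Cents(150), new Cents(200));
        Console.WriteLine(total.Value);                              // 350
        Console.WriteLine((new Cents(150) + new Cents(200)).Value);  // 350
    }
}
```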
 
Jon Skeet [C# MVP]

Mark Wilden said:
Hmmm...I thought all identical strings were folded to the same reference. Or
am I thinking of a completely different language?

That's interning you're thinking of, and while it automatically applies
to string *literals* it certainly doesn't apply to strings in general.
For instance:

using System;

class Test
{
    static void Main()
    {
        string x = "hello".Substring (0, 1);
        string y = "hi".Substring (0, 1);
        Console.WriteLine (x==y);
        Console.WriteLine (object.ReferenceEquals(x, y));
    }
}
 
Mark Wilden

Jon Skeet said:
That's interning you're thinking of, and while it automatically applies
to string *literals* it certainly doesn't apply to strings in general.

Thanks. I knew I was thinking of something...
 
