object: == vs. Equals()

Zeng

I'm trying to compare 2 objects (pointer to object) to see if they are the
"same" as each other. Here is my definition of being the "same":

object type for both objects   object 1   object 2   same?
-----------------------------------------------------------
Decimal                        199677     199677     yes
String                         "hello"    "hello"    yes
                               null       null       yes

It looks like I have to do 2 comparisons to cover this definition:
public bool compare( object obj1, object obj2 )
{
    if( obj1 == obj2 ) // this won't detect 2 strings being the same
        return true;
    // assuming obj1 is not null
    if( obj1.Equals( obj2 ) ) // this won't detect 2 integers being the same!!!
        return true;
    return false;
}

Is that because we have value types and reference types? Notice that I haven't
considered obj1 being null yet; once I do, the code gets even messier. I still
can't believe it's that complicated to do such a simple task in C#. Shouldn't
the string and integer classes both override the Equals() method to return
true when the values are the same? Would someone out there please correct
me if I'm wrong? Thanks!
 
Mountain Bikn' Guy

1. The CLR uses string interning. This means that all usages of the same
string in a program refer to a single instance of a string object. This needs
to be considered when writing programs that test string equality. See Eric
Gunnerson's book for more examples. It can particularly come into play when
you cast a string to an object, as it seems you are doing.

I believe you will find it much simpler to compare strings and Decimals
using the standard approach in the .NET Framework. Is there some reason you
have to take the approach shown in your code? In case you aren't familiar
with some of these points, I'm listing a couple more items below that might
interest you.

2. Any class that overrides Equals() should override GetHashCode()

3. System.ValueType contains a version of Equals() that works on all value
types, but it is not optimally efficient because it uses reflection.
Therefore, Gunnerson recommends that an implementation of Equals be written
for all value types. Personally, I am not aware that this is typically done.
Does anyone else know if Gunnerson's recommendation is to be taken
literally?

4. There are some subtle issues when overriding Equals in inheritance
hierarchies. I think it is worth looking at Gunnerson's book to study the
examples.
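
Points 2 and 3 above can be sketched roughly as follows. This is only an illustration, not code from Gunnerson's book: the Money struct and its fields are hypothetical names made up for the example. It shows a value type that supplies its own Equals (so the reflection-based System.ValueType version is not used) together with a matching GetHashCode.

```csharp
using System;

// Hypothetical value type, used only for illustration.
public struct Money
{
    private readonly decimal amount;
    private readonly string currency;

    public Money(decimal amount, string currency)
    {
        this.amount = amount;
        this.currency = currency;
    }

    // Type-specific comparison: compares the fields directly, no reflection.
    public bool Equals(Money other)
    {
        return amount == other.amount && currency == other.currency;
    }

    // Override of Object.Equals: false for null or for any other type.
    public override bool Equals(object obj)
    {
        return obj is Money && Equals((Money)obj);
    }

    // Point 2: values that compare equal must produce equal hash codes.
    public override int GetHashCode()
    {
        int currencyHash = currency == null ? 0 : currency.GetHashCode();
        return amount.GetHashCode() ^ currencyHash;
    }
}

public static class MoneyDemo
{
    public static void Main()
    {
        object a = new Money(5m, "USD");
        object b = new Money(5m, "USD");
        object c = new Money(5m, "EUR");

        Console.WriteLine(a.Equals(b));                        // True
        Console.WriteLine(a.Equals(c));                        // False
        Console.WriteLine(a.GetHashCode() == b.GetHashCode()); // True
    }
}
```

The boxed comparisons in Main go through the Equals(object) override, which is the path taken when the values are only known as object.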

Hope that helps some.
 
Zeng

Would you be able to show me how you would write a compare function that
compares 2 objects which in my case are always of the same type (both
decimal , both String, or both null)?

Thanks!
 
Frank Oquendo

Zeng said:
It looks like I have to do 2 comparisons to cover this definition:
public bool compare( object obj1, object obj2 )
{
    if( obj1 == obj2 ) // this won't detect 2 strings being the same
        return true;
    // assuming obj1 is not null
    if( obj1.Equals( obj2 ) ) // this won't detect 2 integers being the same!!!
        return true;
    return false;
}

I'm not sure how you're getting your results. I get True when comparing
two strings regardless of which method I use. The same goes for ints.

--
There are 10 kinds of people. Those who understand binary and those who
don't.

http://code.acadx.com
(Pull the pin to reply)
 
Zeng

Are you sure you tested the worst case, with 2 separate instances of an
object? I remember reading somewhere that if you do
string s1 = "hello";
string s2 = "hello";

you still have the same string internally until you modify it.
 
Frank Oquendo

Zeng said:
you still have the same string internally until you modify it.

Strings are immutable and they're compared using value semantics so
unless two strings contain the exact same text, and are thus references
to the same string, they are not equal.

 
Zeng

I'm seeing strings that contain the same value, yet one of the two methods
reports them as different. The sources of these strings are:
String 1: from a DataRow object, the result of reading a string field from the db.
String 2: via PropertyInfo.GetValue of a property returning a string.

Looking at the debugger, both objects show up as string type with the same
value/content. If they both hold the empty string "" then both methods are
correct. But when they are both "Hello", for example, == says false
while Equals( ) says true!
 
mikeb

Please read:

http://msdn.microsoft.com/library/en-us/cpgenref/html/cpconEquals.asp


http://msdn.microsoft.com/library/en-us/csref/html/vclrfequalityoperator.asp

and related documentation.

Note that:

1) value types in the .NET Framework perform value equality, so
comparing doubles, decimals, ints, etc. behaves as you would expect.

2) the String class overrides the Equals() method so that it also
performs a value comparison, i.e., two separate instances of string that
contain the same sequence of characters will return true when tested
using the Equals() method. So strings behave very much like value types
as far as the Equals() method is concerned. This has nothing to do with
string interning.

3) null references will always return false when compared with
non-null references, and will always return true when null is compared
with null.

4) someone who implements an override of the Equals method is
certainly able to break the rules - however that class implementation
will also not work very well in the framework (and would be considered
bug-ridden by anyone trying to use it).
 
Frank Oquendo

Zeng said:
I'm seeing strings contain the same values and it fails to be the
same in one of the methods. The sources of these strings are:
String 1: from DataRow object, a result of reading a string field
from db.

Unless you're using a strongly typed DataSet, DataRow[index] returns an
object, not a string. Just because your object may contain a string is
no reason to assume it's equal to an intrinsic string object.

Zeng said:
String 2: via PropertyInfo.GetValue of a property returning
a string.

Same deal applies here.

Zeng said:
Looking at the debugger, both objects show up as string type with the
same value/content. Yet, if they both have empty string "" then both
methods are correct. But when they are both "Hello" for example, ==
would say false when Equals( ) would say true!

With reference types Equals will check to see if two references point to
the same object. In your case, I'd expect it to return true if two
objects represent a value from the same row and field. As for '==', I'd
have no idea what to expect when comparing two references.

 
Matthew W. Jackson

Object.Equals is virtual. However the == operator for objects is NOT
virtual, nor does it call the virtual Object.Equals function. If you call
the == operator on two values that are only known to be Objects, it uses
reference equality (even for boxed values).

The only reason Object.Equals won't detect two integers being equal is if
they are of different types. The rules for Object.Equals state that if they
are of different types, they are not equal. For example, 5 as an Int32 is
NOT equal to 5 as an Int16. If you are using numeric literals anywhere, you
may want to explicitly specify their type (using the one-letter suffixes or
casts). If that is the case, then Int32.Equals won't work either.

Run this example to see what I mean:

int x = 5;
short y = 5;
object boxX = x;
object boxY = y;
Console.WriteLine(x == y);
Console.WriteLine(boxX == boxY);
Console.WriteLine(x.Equals(y));
Console.WriteLine(boxX.Equals(boxY));

You should end up with True, False, False, False

Change the variable y to "int" and you will get True, False, True, True.
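
One caveat on the third line of that output, offered as a side note rather than a correction of the reasoning: the results quoted above match a runtime where Int32 only has Equals(object). On .NET 2.0 and later, Int32 also has an Equals(int) overload, so x.Equals(y) with a short y binds to that overload after an implicit widening and prints True. The sketch below (the class name is made up for the example) shows how to force the Equals(object) path and see the type-sensitive behavior described here.

```csharp
using System;

public static class EqualsOverloadDemo
{
    public static void Main()
    {
        int x = 5;
        short y = 5;

        // Binds to Int32.Equals(int) where that overload exists (.NET 2.0+):
        // y is implicitly widened to int before the call.
        Console.WriteLine(x.Equals(y));

        // Forces Int32.Equals(object): y is boxed as an Int16, and an Int32
        // is never equal to a boxed Int16.
        Console.WriteLine(x.Equals((object)y));

        // There is no implicit int-to-short conversion, so this binds to
        // Int16.Equals(object) with x boxed as an Int32.
        Console.WriteLine(y.Equals(x));
    }
}
```

On .NET 2.0 or later this prints True, False, False. The boxed comparison boxX.Equals(boxY) is unaffected and still returns False for an int and a short.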

This is because, like I said earlier, == on Objects is for reference
equality, and the behavior is not virtual. If that's what you want, you
could write a wrapper for Object where the == operator calls the virtual
Object.Equals....but PLEASE don't do that. You will confuse anyone looking
at your code.

Consider adding something like this to your comparison function to make
sure:

System.Diagnostics.Debug.WriteLine(obj1.GetType().ToString());
System.Diagnostics.Debug.WriteLine(obj2.GetType().ToString());

And then check the output in VS's Output window. I'm pretty sure that 199677 and
199677 are not equal because of this.

--Matthew W. Jackson
 
Sherif ElMetainy

Hello

You don't have to write your own method; the framework already defines one
for you, the static version of Object.Equals. Just call

bool eq = Object.Equals(obj1, obj2);

In case you want to know how to write the method, here is one way (it is
very close to what you wrote in your initial post, and it is also how
the static Object.Equals is implemented):

bool compare(Object obj1, Object obj2) {
    // check if they refer to the same object or are both null
    if (obj1 == obj2) {
        return true;
    }
    // since they are not both null, if either one is null the method
    // should return false
    if (obj1 == null || obj2 == null) {
        return false;
    }
    // calls the instance version of Equals; string and decimal both
    // override Equals
    return obj1.Equals(obj2);
}
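
As a quick check against the table in the original question, the static Object.Equals can be exercised like this. It is a minimal sketch: the class and variable names are invented for the example, and the second string is deliberately built at run time so that it is a separate instance from the literal.

```csharp
using System;
using System.Text;

public static class StaticEqualsDemo
{
    public static void Main()
    {
        // Two separately boxed decimals with the same value.
        object d1 = 199677m;
        object d2 = 199677m;

        // A literal and a string assembled at run time (a different instance).
        object s1 = "hello";
        object s2 = new StringBuilder("hel").Append("lo").ToString();

        object n1 = null;
        object n2 = null;

        Console.WriteLine(Object.Equals(d1, d2)); // True
        Console.WriteLine(Object.Equals(s1, s2)); // True
        Console.WriteLine(Object.Equals(n1, n2)); // True
        Console.WriteLine(Object.Equals(s1, n1)); // False
        Console.WriteLine(Object.Equals(n1, s1)); // False, and no exception
    }
}
```

The last two lines show that the null checks happen before the instance Equals is called, so either argument may be null.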

Best regards,
Sherif
 
Sherif ElMetainy

Hello

2 strings can be equal when they don't refer to the same string.
Here is an example

string s1 = "hello";
string s2 = "hel";
string s3 = s2 + "lo";
Console.WriteLine(Object.ReferenceEquals(s1, s3)); // this should display false
Console.WriteLine(Object.Equals(s1, s3)); // this should display true
s3 = String.Intern(s3);
Console.WriteLine(Object.ReferenceEquals(s1, s3)); // this should display true
Console.WriteLine(Object.Equals(s1, s3)); // this should display true

Best regards,
Sherif
 
Frank Oquendo

Sherif said:
Hello

2 strings can be equal when they don't refer to the same string.
Here is an example

string s1 = "hello";
string s2 = "hel";
string s3 = s2 + "lo";

"hel" + "lo" = "hello", "hello" == "hello" = true.

s3 refers to the same string as s1. Am I missing something?

 
Sherif ElMetainy

Hello

For "hel" + "lo", the C# compiler performs the string concatenation at compile
time, so this is exactly like "hello", and the C# compiler makes all instances of
the literal string "hello" refer to the same string to save memory and reduce
the assembly size.

But in my example, the concatenation is made at run time, not at compile time.

string s2 = "hel";
string s3 = s2 + "lo";

The C# compiler isn't smart enough to make s3 refer to the literal "hello" at
compile time, so the value of s3 is calculated at run time, and there will be
2 copies of the string "hello" in memory: the literal version, stored in s1,
and the calculated version, stored in s3.
This is why Object.ReferenceEquals(s1, s3) would return false, because they
refer to 2 different copies.
Object.Equals would return true, because the string class overrides Equals,
and in this case the 2 strings are compared character by character.
s1 == s3 would also be true, because the string class overloads the ==
operator.

Best regards,
Sherif
 
Jon Skeet [C# MVP]

Mountain Bikn' Guy said:
1. The CLR uses string interning. This means that all usages of the same
string in a program refer to a single instance of a string object.

Not quite. It means that all usages of the same string *literal* refer
to a single instance. So:

string a = "hello";
string b = "hello";
string c = "hel"+"lo"; // Constant concatenation

StringBuilder builder = new StringBuilder();
builder.Append("hello");

string d = builder.ToString();

After this, a, b and c will all be references to the same object, but d
will be a reference to a different string instance with the same data.

a==d will still return true due to operator overloading, but
Object.ReferenceEquals(a, d) will return false.
 
Jon Skeet [C# MVP]

Frank Oquendo said:
Strings are immutable and they're compared using value semantics so
unless two strings contain the exact same text, and are thus references
to the same string, they are not equal.

string x = "hello";

StringBuilder sb = new StringBuilder();
sb.Append ("hello");

string y = sb.ToString();

x and y are now two references to different strings but which have the
same data. Test to show that:

using System;
using System.Text;

class Test
{
static void Main()
{
string x = "hello";

StringBuilder sb = new StringBuilder();
sb.Append ("hello");

string y = sb.ToString();

Console.WriteLine (Object.ReferenceEquals(x, y));
}
}

Note that x==y will still print true due to operator overloading.
 
Jon Skeet [C# MVP]

Frank Oquendo said:
With reference types Equals will check to see if two references point to
the same object.

That's only true when it's not overridden.

Frank Oquendo said:
In your case, I'd expect it to return true if two
objects represent a value from the same row and field.

Strings don't know where they came from (and neither do decimals etc).

Frank Oquendo said:
As for '==', I'd
have no idea what to expect when comparing two references.

It depends on operator overloading. By default, == returns whether or
not the references are equal, i.e. whether or not they refer to the
same object (or are both null). It can be overloaded though, as
string happens to do - but operator overloading isn't applied
polymorphically, so if we have:

string x = MethodReturningHello();
string y = MethodReturningHello();

object a = x;
object b = y;

where a reference to a different string is returned each time by
MethodReturningHello(), but with the same data in, we would have:

x==y - true due to operator overloading
a==b - false, as object== just compares references
x.Equals(y) - true
a.Equals(b) - true as Equals is invoked polymorphically
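
The four results above can be reproduced with a small program along these lines. MethodReturningHello is only named, not defined, in this post, so the body below is an assumption: it builds the string with a StringBuilder so that every call returns a new instance holding the same characters.

```csharp
using System;
using System.Text;

public static class OperatorVsEqualsDemo
{
    // Assumed implementation: assembles "hello" at run time, so each call
    // yields a distinct string object rather than the interned literal.
    public static string MethodReturningHello()
    {
        return new StringBuilder("hel").Append("lo").ToString();
    }

    public static void Main()
    {
        string x = MethodReturningHello();
        string y = MethodReturningHello();

        object a = x;
        object b = y;

        Console.WriteLine(x == y);                       // True: string's overloaded ==
        Console.WriteLine(a == b);                       // False: object == compares references
        Console.WriteLine(x.Equals(y));                  // True
        Console.WriteLine(a.Equals(b));                  // True: virtual call reaches String.Equals
        Console.WriteLine(Object.ReferenceEquals(x, y)); // False: two separate instances
    }
}
```

The choice between the two == operators is made at compile time from the declared types of the operands, which is why a == b and x == y disagree even though they look at the same two objects.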
 
Jon Skeet [C# MVP]

Frank Oquendo said:
"hel" + "lo" = "hello", "hello" == "hello" = true.

s3 refers to the same string as s1. Am I missing something?

Yes - s3 *doesn't* refer to the same string as s1. It would do if it
had been written as:

string s3 = "hel" + "lo";

because then the whole expression would be a constant.
 
