string compare problem


David zhu

I get different results when comparing two strings
using "==" and string.Compare().
The two strings seem to have the same value, "1202002", in
the quick watch, and both have the same length, 7, which I
printed out with Debug.WriteLine().
But the "==" operator gives false, and string.Compare()
gives true.
Somebody help me!
 
David said:
I get different results when comparing two strings
using "==" and string.Compare().
The two strings seem to have the same value, "1202002", in
the quick watch, and both have the same length, 7, which I
printed out with Debug.WriteLine().
But the "==" operator gives false, and string.Compare()
gives true.
Somebody help me!

Different character sets can generate different results. ASCII, UTF-8,
Unicode.

You should never use == for string comparisons -- only the string methods.
 
David zhu said:
I get different results when comparing two strings
using "==" and string.Compare().
The two strings seem to have the same value, "1202002", in
the quick watch, and both have the same length, 7, which I
printed out with Debug.WriteLine().
But the "==" operator gives false, and string.Compare()
gives true.
Somebody help me!

String objects are references to the actual strings. As such, they hold the
address of a string.
If you use == to compare String objects, you are comparing addresses, not
the contents of the strings.
Unless you've created two references to the same string, s1 == s2 will never
be true.
Use the methods defined in the String class.
 
802.16a said:
Different character sets can generate different results. ASCII, UTF-8,
Unicode.

You should never use == for string comparisons -- only the string methods.

I don't believe a string in .NET holds any encoding information. A string in .NET is always Unicode.
 
Peter van der Goes said:
If you use == to compare String objects, you are
comparing addresses, not the contents of the strings.

That doesn't appear to be true. Try this code.

string s1 = "hello", s2 = "hell";

if (s1 == s2 + "o")
{
Console.WriteLine("equal"); // <-- we get here
}

P.
 
802.16a said:
Different character sets can generate different results. ASCII, UTF-8,
Unicode.

As Daniel says, strings in .NET don't have encodings associated with
them - they're all just Unicode.
You should never use == for string comparisons -- only the string methods.

Care to give some reasons? Note that using == is identical to calling
the Equals method.

The difference between == and Compare is that Compare is culture-
sensitive whereas == isn't, btw.
 
Daniel Jin said:
can you post some code that will demonstrate your problem?

I can.

using System;

class Test
{
static void Main()
{
string x = "\u00e9";
string y = "e\u0301";
Console.WriteLine (String.Compare(x,y));
Console.WriteLine (x==y);
}
}

The first string is an e with an acute accent.
The second is an e followed by an acute accent combining character.
When compared with a culture-insensitive comparison (==) they're
different strings. When compared with a culture-sensitive comparison
(Compare) the result is 0 (equal).
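
For anyone who wants to make the comparison mode explicit rather than rely on the defaults, here is a minimal sketch. It assumes .NET 2.0 or later (for the StringComparison overloads and String.Normalize) and a runtime with its normal globalization data available. The class and method names are made up for illustration.

```csharp
using System;

public static class NormalizationDemo
{
    // Ordinal equality: compares UTF-16 code units one by one, like ==.
    public static bool OrdinalEquals(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Ordinal equality after normalizing both sides to Unicode form C
    // (the default form used by Normalize()).
    public static bool NormalizedEquals(string a, string b)
    {
        return string.Equals(a.Normalize(), b.Normalize(), StringComparison.Ordinal);
    }

    public static void Main()
    {
        string x = "\u00e9";   // precomposed e with acute accent
        string y = "e\u0301";  // e followed by a combining acute accent

        Console.WriteLine(OrdinalEquals(x, y));     // False
        Console.WriteLine(NormalizedEquals(x, y));  // True
        // Culture-sensitive result; it depends on the runtime's globalization setup.
        Console.WriteLine(string.Compare(x, y, StringComparison.CurrentCulture));
    }
}
```

Spelling out the StringComparison value makes it obvious at the call site which kind of comparison is intended.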
 
Larry Osterman had a blog entry covering that, but I think the OP's problem is a little more fundamental. At least that's how I interpreted it; I could be wrong. :)
 
Daniel Jin said:
Larry Osterman had a blog entry covering that, but I think the OP's problem
is a little more fundamental. At least that's how I interpreted it;
I could be wrong. :)

It's kinda hard to tell, seeing as supposedly string.Compare was
returning true, when String.Compare always returns an int... We could
definitely do with more information. Of course, if the expression on
one side of the "==" was of type object rather than string, that would
explain it...
 
Daniel Jin said:
no, == operator is overloaded for string class to do value comparison, not
reference comparison.

Argh! My bad. Thinking in Java. Got to stop working in three languages
concurrently.
 
Peter said:
String objects are references to the actual strings. As such, they hold the
address of a string.
If you use == to compare String objects, you are comparing addresses, not
the contents of the strings.
Unless you've created two references to the same string, s1 == s2 will never
be true.
Use the methods defined in the String class.

The help file for the String class states "This operator is implemented
using the Equals method, which means the comparands are tested for a
combination of reference and value equality." What exactly does this
mean? I use (stringa == stringb) all of the time without issue, and the
examples do exactly the same thing.

However, Equals() does not say the same as above. So what exactly is the
difference?

I am writing this because I disagree with your statement "If you use ==
to compare String objects, you are comparing addresses, not the contents
of the strings."

This is clearly untrue both in practice and in the help file, although
it does seem there is some nuance that needs to be understood.

David Logan

From MSDN itself, describing the String class:

Determines whether two specified String objects have the same value.

[Visual Basic]
returnValue = String.op_Equality(a, b)
[C#]
public static bool operator ==(
string a,
string b
);
[C++]
public: static bool op_Equality(
String* a,
String* b
);
[JScript]
returnValue = a == b;

[Visual Basic] In Visual Basic, you can use the operators defined by a
type, but you cannot define your own. You can use the Equals method
instead of the String equality operator.

[JScript] In JScript, you can use the operators defined by a type, but
you cannot define your own.

Arguments [Visual Basic, JScript]

a
A String or a null reference (Nothing in Visual Basic).
b
A String or a null reference (Nothing in Visual Basic).

Parameters [C#, C++]

a
A String or a null reference (Nothing in Visual Basic).
b
A String or a null reference (Nothing in Visual Basic).

Return Value

true if the value of a is the same as the value of b; otherwise, false.

Remarks

This operator is implemented using the Equals method, which means the
comparands are tested for a combination of reference and value equality.
The comparison is case-sensitive.
 
David Logan said:
This is clearly untrue both in practice and in the help file, although
it does seem there is some nuance that needs to be understood.

Well, one nuance is that not only do both sides need to actually *be*
strings or null, but the expressions themselves need to be of type
string (or null). So if you do:

object a = new string ("hello".ToCharArray());

if (a=="hello")

it will return false, because it's using object== instead of string==.
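
To see the compile-time binding in one place, here is a small self-contained sketch. The class and method names are invented for the example; the only point is that the static types of the two operands decide which == gets used.

```csharp
using System;

public static class OperatorBindingDemo
{
    // Both parameters are typed as object, so == binds to reference equality.
    public static bool ReferenceCompare(object a, object b)
    {
        return a == b;
    }

    // Both parameters are typed as string, so == binds to string's overload.
    public static bool ValueCompare(string a, string b)
    {
        return a == b;
    }

    public static void Main()
    {
        // new string(char[]) builds a fresh instance instead of reusing the literal.
        object a = new string("hello".ToCharArray());
        string b = "hello";

        Console.WriteLine(ReferenceCompare(a, b));       // False
        Console.WriteLine(ValueCompare((string)a, b));   // True
        Console.WriteLine(a.Equals(b));                  // True: Equals is virtual
    }
}
```

Equals still gives the value comparison because it is a virtual method, whereas operators are resolved by the compiler from the declared types.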
 
David Logan said:
I am writing this because I disagree with your statement "If you use ==
to compare String objects, you are comparing addresses, not the contents
of the strings."


David Logan
Absolutely. And if you look at my second post, I disagree with myself.
I apologize for the obvious error. I must have been thinking in Java at the
time :-(
Think first, type second...
think first, type second... (muttering to myself).
 
Thanks a lot! But it doesn't look like culture-insensitive
strings are involved in my code.
Is there any way to capture the original binary data
of a string? I think that might help in catching the
error.
Here's my code below. The problem appears in this
statement: "if ( branch == state.branch_no && op_receive
== state.op_receive)".
Here are the results in the quick watch:
branch "1202"
state.branch "1202"
branch == state.branch true
op_receive "1202002"
state.op_receive "1202002"
op_receive==state.op_receive false

////////////////////////////////////////////////////
source code:
////////////////////////////////////////////////////


public class StateObject
{
    // Size of receive buffer.
    public const int BufferSize = 8192;
    public bool connected = false; // ID received flag
    // Client socket.
    public Socket workSocket = null;
    public byte[] buffer = new byte[BufferSize + 1];
    public StringBuilder sb = new StringBuilder();
    public string id = String.Empty;
    public DateTime TimeStamp;
    public int size = -9999;
    public PacketHead head;
    public int iBufferUsed = 0;
    public string branch_no;
    public string op_receive;
}

public void CallBack(string Info)
{
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(Info);
    string branch = doc.SelectSingleNode(
        "/Record/field[@name=\"branch_no\"]").InnerText.Trim();
    string op_receive = doc.SelectSingleNode(
        "/Record/field[@name=\"op_receive\"]").InnerText.Trim();
    if (doc.SelectSingleNode(
        "/Record/field[@name=\"send_status\"]").InnerText.Trim() != "1")
        return;

    try
    {
        // Copy the connected states into a local list before iterating.
        ArrayList workSocketList = new ArrayList();
        foreach (StateObject state in connectedSocks)
            workSocketList.Add(state);

        foreach (StateObject state in workSocketList)
        {
            if (branch == state.branch_no &&
                op_receive == state.op_receive)
            /* if (string.Compare(branch, state.branch_no) == 0 &&
                   string.Compare(op_receive, state.op_receive) == 0) */
            {
                byte[] bytedata = Encoding.Unicode.GetBytes(Info);

                // 4-byte length prefix followed by the UTF-16 payload.
                byte[] answer = new Byte[bytedata.Length + 4];
                Array.Copy(BitConverter.GetBytes(bytedata.Length), answer, 4);
                Array.Copy(bytedata, 0, answer, 4, bytedata.Length);

                Debug.WriteLine("Sent to: " +
                    state.branch_no + "," + state.op_receive);

                Send(state.workSocket, answer);
            }
        }
    }
    catch (Exception ex)
    {
        MessageBox.Show("CheckSockets:" + ex);
    }
}
 
Jon said:
Well, one nuance is that not only do both sides need to actually *be*
strings or null, but the expressions themselves need to be of type
string (or null). So if you do:

object a = new string ("hello".ToCharArray());

if (a=="hello")

it will return false, because it's using object== instead of string==.

OK, I have it. operator== isn't a virtual/override from Object. Object
has no == operator, so by default it checks for referential equality,
which is unequal in most cases. If you use == from an actual string
class, it is overridden, does use the Equal() method, and so all is well.

So basically that means that using == is a safe way to compare strings
at least 99% of the time.

David Logan
 
Thanks a lot! But it doesn't look like culture-insensitive
strings are involved in my code.
Is there any way to capture the original binary data
of a string? I think that might help in catching the
error.
Here's my code below. The problem appears in this
statement: "if ( branch == state.branch_no && op_receive
== state.op_receive)".
Here are the results in the quick watch:
branch "1202"
state.branch "1202"
branch == state.branch true
op_receive "1202002"
state.op_receive "1202002"
op_receive==state.op_receive false

Could you post a short but *complete* program which demonstrates the
problem?

See http://www.pobox.com/~skeet/csharp/complete.html for details of
what I mean by that.

We need to be able to reproduce the problem, really.
 
Hi, my attempt at producing test code has failed,
because the data was retrieved from socket communication.
Is there any other solution?
I think printing out the binary data of the two strings would
show the way!
 
David zhu said:
Hi, my attempt at producing test code has failed,
because the data was retrieved from socket communication.

Is there any chance that one of your strings had nul characters
(Unicode 0) when they shouldn't?
Is there any other solution?
I think printing out the binary data of the two strings would
show the way!

Indeed:

foreach (char c in theString)
{
Console.WriteLine ((int)c);
}

will dump out the contents of the string in a no-nonsense way.
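
If a one-line dump is easier to compare than one number per line, the same idea can be wrapped in a helper. This is only a sketch; StringDump and ToCodeUnits are names I've made up, and the sample strings are illustrative rather than the OP's data.

```csharp
using System;
using System.Text;

public static class StringDump
{
    // Returns the UTF-16 code units of s as hex, separated by spaces.
    public static string ToCodeUnits(string s)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string clean = "12";
        string padded = "12\0";  // same digits plus a trailing nul

        Console.WriteLine(ToCodeUnits(clean));   // 0031 0032
        Console.WriteLine(ToCodeUnits(padded));  // 0031 0032 0000
        Console.WriteLine(clean == padded);      // False
    }
}
```

Characters that a debugger window or console cannot show, such as a nul, still appear in the dump as their own code unit.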
 
Hi, excellent method you have posted! Here are the results:

[2448] op_receive,49,
[2448] op_receive,50,
[2448] op_receive,48,
[2448] op_receive,50,
[2448] op_receive,48,
[2448] op_receive,48,
[2448] op_receive,50,
[2448] sate.op_receive,49,
[2448] sate.op_receive,50,
[2448] sate.op_receive,48,
[2448] sate.op_receive,50,
[2448] sate.op_receive,48,
[2448] sate.op_receive,48,
[2448] sate.op_receive,50,
[2448] sate.op_receive,0,
[2448] 1202002,1202002
[2448] True //use string.compare
[2448] False //use ==

It's time for you to draw a conclusion, I think, ^_^.
 
David zhu said:
Hi, excellent method you have posted! Here are the results:

[2448] op_receive,49,
[2448] op_receive,50,
[2448] op_receive,48,
[2448] op_receive,50,
[2448] op_receive,48,
[2448] op_receive,48,
[2448] op_receive,50,
[2448] sate.op_receive,49,
[2448] sate.op_receive,50,
[2448] sate.op_receive,48,
[2448] sate.op_receive,50,
[2448] sate.op_receive,48,
[2448] sate.op_receive,48,
[2448] sate.op_receive,50,
[2448] sate.op_receive,0,
[2448] 1202002,1202002
[2448] True //use string.compare
[2448] False //use ==

It's time for you to draw a conclusion, I think, ^_^.

The same conclusion I reached before - there's a Unicode nul (0) at the
end of state.op_receive, but not at the end of op_receive. So, the next
question you need to ask yourself is how it got there...
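
One common way for a trailing nul to sneak in, offered purely as a guess since the receiving code wasn't posted, is decoding more of a zero-filled receive buffer than was actually received. The sketch below simulates that; the class name, method names, and the 16-byte buffer are all hypothetical and not taken from the OP's program.

```csharp
using System;
using System.Text;

public static class BufferDecodeDemo
{
    // Decodes the entire buffer, including any unused zero bytes at the end.
    public static string DecodeWholeBuffer(byte[] buffer)
    {
        return Encoding.Unicode.GetString(buffer, 0, buffer.Length);
    }

    // Decodes only the bytes that were actually received.
    public static string DecodeReceived(byte[] buffer, int bytesReceived)
    {
        return Encoding.Unicode.GetString(buffer, 0, bytesReceived);
    }

    public static void Main()
    {
        byte[] payload = Encoding.Unicode.GetBytes("1202002"); // 14 bytes
        byte[] buffer = new byte[16];                          // zero-filled
        Array.Copy(payload, buffer, payload.Length);

        string padded = DecodeWholeBuffer(buffer);
        string exact = DecodeReceived(buffer, payload.Length);

        Console.WriteLine(padded.Length);                 // 8
        Console.WriteLine(exact.Length);                  // 7
        Console.WriteLine(padded.Trim() == exact);        // False: Trim() keeps '\0'
        Console.WriteLine(padded.TrimEnd('\0') == exact); // True
    }
}
```

Note that Trim() with no arguments only strips white space, and nul is not white space, so calling Trim() on the received value would not have removed it. Decoding only the received byte count is the cleaner fix; TrimEnd('\0') is a fallback.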
 