Problem with .net and Strings

Travis Ellis

I am having a problem with .net Strings.

Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();
if( foo == "" )
{
    Console.WriteLine("Empty"); // This never happens
}

I have noticed that when a string is created from null characters
((byte)0 or '\u0000'), Trim() doesn't remove them the way it removes whitespace.
My example above looks like the string "" but it has a .Length of 4, so
it does not compare equal to "" with the == operator.
If I do this:

if( foo.CompareTo("") == 0 )
{
    Console.WriteLine("Empty"); // this works
}

I noticed that I can use CompareTo to compare against an empty string when I
create the string from a byte or char array that might be 0 filled.
Anybody know why this is happening?

I can do a similar thing in Java and not have the same problem:

String foo = new String( new byte[]{ 0, 0, 0, 0 } ).trim();
System.out.println(foo.length()); // this prints 0 b/c the string is empty


Yet in C#.net it would print 4, but it would still appear that foo is "" by
using CompareTo

Jon Skeet [C# MVP]

Travis Ellis said:
I am having a problem with .net Strings.

Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();
if( foo == "" )
{
Console.WriteLine("Empty"); // This never happens
}

Sure. The null character isn't treated as a whitespace character in
.NET.

I have noticed that when you create a string using the null characters
(byte)0 or '\u0000' that it doesn't trim the whitespace.
My example above would be the string "" but it would have a .Length of 4 so
it is not comparing right with the == operator.
If I do this:

if( foo.CompareTo("") == 0 )
{
Console.WriteLine("Empty"); //this works
}

I noticed that I can use CompareTo to compare against an empty string when I
create the string from a byte or char array that might be 0 filled.
Anybody know why this is happening?

I can do a similar thing in Java and not have the same problem:

String foo = new String( new byte[]{ 0, 0, 0, 0 } ).trim();
System.out.println(foo.length()); // this prints 0 b/c the string is empty

Yet in C#.net it would print 4, but it would still appear that foo is "" by
using CompareTo

Java's String.trim() method treats all characters which have Unicode
values of 32 or less as whitespace; .NET has a somewhat stricter
definition.

If you want to trim nulls as well, you can use the overload of String.Trim
which takes a character array (as a params argument), e.g.:

x = x.Trim(' ', '\0');
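
One thing to watch with that overload: the characters you pass replace the default whitespace set rather than adding to it, so Trim(' ', '\0') leaves tabs and line breaks alone. A minimal sketch of a helper that trims both follows; TrimNullsDemo and TrimWhitespaceAndNulls are made-up names for illustration, not framework members.

```csharp
using System;
using System.Text;

static class TrimNullsDemo
{
    // Hypothetical helper (not part of the framework): trims NUL characters
    // in addition to everything char.IsWhiteSpace reports as white space.
    public static string TrimWhitespaceAndNulls(string value)
    {
        int start = 0;
        int end = value.Length - 1;

        // Skip trimmable characters from the front, then from the back.
        while (start <= end && IsTrimmable(value[start])) start++;
        while (end >= start && IsTrimmable(value[end])) end--;

        return value.Substring(start, end - start + 1);
    }

    private static bool IsTrimmable(char c)
    {
        return c == '\0' || char.IsWhiteSpace(c);
    }

    static void Main()
    {
        string raw = Encoding.ASCII.GetString(new byte[] { 0, 0, 0, 0 });

        Console.WriteLine(raw.Trim().Length);           // 4: NUL is not white space
        Console.WriteLine(raw.Trim(' ', '\0').Length);  // 0: NUL listed explicitly
        Console.WriteLine(TrimWhitespaceAndNulls("\0 \tabc\r\n\0")); // abc
    }
}
```

If the nulls only ever appear as padding at the end of a fixed-size buffer, TrimEnd('\0') is probably all that is needed.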

C# Learner

Travis Ellis said:
Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();

if( foo.CompareTo("") == 0 )
{
Console.WriteLine("Empty"); //this works
}

This is weird. The following prints 'This will display.'.

using System;
using System.Text;

class Test
{
    static void Main() {
        byte[] bytes = new byte[] { 65, 0, 0, 0, 0 };
        string s = Encoding.ASCII.GetString(bytes);

        if ("A".CompareTo(s) == 0) {
            Console.WriteLine("This will display.");
        }
        if ("A" == s) {
            Console.WriteLine("This won't.");
        }

        Console.Read();
    }
}

Changing the byte array declaration to

byte[] bytes = new byte[] { 65, 0, 65, 0, 0 };

stops that from happening, however. But then this expression evaluates to
true:

"AA".CompareTo(s) == 0

And so does this:

"\0AA".CompareTo(s) == 0

Conclusion: String.CompareTo, using my current culture, ignores null
characters in both strings (leading, trailing or embedded) when performing
the comparison, even though this doesn't seem to be documented.
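
For anyone who needs the comparison to see the nulls, an ordinal comparison looks at the raw char values instead of culture rules. A rough sketch, assuming the likely explanation is that the culture-sensitive comparison gives U+0000 no weight (I haven't confirmed that for every runtime and culture, so no output is claimed for that line); OrdinalCompareDemo and SameChars are made-up names for illustration.

```csharp
using System;
using System.Text;

static class OrdinalCompareDemo
{
    // Hypothetical helper: true only when both strings hold exactly the
    // same sequence of chars, NULs included.
    public static bool SameChars(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    static void Main()
    {
        string s = Encoding.ASCII.GetString(new byte[] { 65, 0, 0, 0, 0 });

        // Culture-sensitive: the result depends on the runtime and culture.
        Console.WriteLine("A".CompareTo(s));

        // Ordinal: the four trailing NULs make the strings different.
        Console.WriteLine(string.CompareOrdinal("A", s) == 0); // False
        Console.WriteLine(SameChars("A", s));                   // False

        // Stripping the padding first makes them equal again.
        Console.WriteLine(SameChars("A", s.TrimEnd('\0')));     // True
    }
}
```

The == operator on strings is also an ordinal comparison, which is why it disagrees with CompareTo in the examples above.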

Phil Wilson

Complete speculation on my part, but I think it's more likely that it just
uses the data in chunks of 16 bits and ignores that dangling 8 bits of null
that it can't do anything with anyway.
--
Phil Wilson
[Microsoft MVP-Windows Installer]
