string.Empty

  • Thread starter: Smithers

Smithers

I have been told that it is a good idea to *always* declare string variables
with a default value of string.Empty - for cases where an initial value is
not known... like this:

string myString = string.Empty; // do this
string myString; // do not do this

Questions
1. Is that a good rule?
2. If so, why? If not, why not?

Thanks!
 
Smithers said:
I have been told that it is a good idea to *always* declare string variables
with a default value of string.Empty - for cases where an initial value is
not known... like this:

string myString = string.Empty; // do this
string myString; // do not do this

Questions
1. Is that a good rule?
No.

2. If so, why? If not, why not?

Generally speaking, if a rule says "always" and it involves restricting
you above and beyond what the language allows, it's not a good rule.

In this particular case, I see no reason at all why initializing string
variables with null would be inappropriate in a variety of situations.

Initializing with string.Empty (or the equivalent) is desirable when you
want to always be able to assume the string is initialized to an
instance, and an empty string is a useful value. But to say that doing
so is _always_ the right thing to do is wrong.

Pete
 
Hi,


Smithers said:
I have been told that it is a good idea to *always* declare string
variables with a default value of string.Empty - for cases where an initial
value is not known... like this:

No idea why you would want to do that.
string myString = string.Empty; // do this
string myString; // do not do this


You know that those two lines are not the same, right? String is a reference
type, which means that in the second line it will be unassigned (referencing
null, for lack of a better way of saying it).
The first line assigns it to an instance. Now, if you have a type where you
want to use String.Empty as a default, then it makes sense.

But that is far from being *always*
 
Smithers said:
I have been told that it is a good idea to *always* declare string variables
with a default value of string.Empty - for cases where an initial value is
not known... like this:

string myString = string.Empty; // do this
string myString; // do not do this

Questions
1. Is that a good rule?

No rule that says you should *always* do something, no matter what the
situation, is ever a good rule. All rules have exceptions.
2. If so, why? If not, why not?

I have seen, though not often, scenarios where it was important to know
if the field was uninitialized or if it was set to a value of "".
Following the above rule, that distinction would no longer be possible.
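
To illustrate the distinction described above, here is a minimal sketch. The Customer type and its MiddleName field are hypothetical names made up for this example, not anything from the thread; the point is only that null can mean "never supplied" while "" means "explicitly empty".

```csharp
using System;

// Hypothetical type for illustration only.
public class Customer
{
    // Fields default to null, which here stands for "never supplied".
    public string MiddleName;

    public string Describe()
    {
        if (MiddleName == null) return "middle name unknown";
        if (MiddleName.Length == 0) return "no middle name";
        return "middle name: " + MiddleName;
    }
}

public static class NullVersusEmptyDemo
{
    public static void Main()
    {
        Customer c = new Customer();
        Console.WriteLine(c.Describe());   // middle name unknown

        c.MiddleName = string.Empty;
        Console.WriteLine(c.Describe());   // no middle name
    }
}
```

If the field were always initialized to string.Empty, the first branch could never be taken and the two states would collapse into one.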
 
What's going on is that I'm trying to learn best practices and guidelines.
Juval Lowy has published a document "C# Coding Standard - Guidelines and
Best Practices"

What it amounts to is a list of rules with no explanations. I wouldn't just
blindly follow the rules - especially suspicious of any that start out with
"[always | never ]do xyz".

His rule #55 states
"use string.Empty rather than """

So, yes - I botched the OP here because the rule talks about
string.Empty
vs.
""
as opposed to string.Empty vs. null (which I put in the OP).

Anyway, I've heard this rule from Jesse Liberty as well. So given that at
least two well-published authors speak of it, I wanted to understand the
reasoning behind it. Thus the OP here.

-S
 
Smithers said:
What's going on is that I'm trying to learn best practices and guidelines.
Juval Lowy has published a document "C# Coding Standard - Guidelines and
Best Practices"

What it amounts to is *a list of rules with no explanations*. I wouldn't just
blindly follow the rules - especially suspicious of any that start out with
"[always | never ]do xyz".

Maybe it's just a pet peeve of mine or something, but I tend to avoid
authors who tell the reader they *should* do something without providing
any explanation as to why.

For curiosity's sake though, maybe someone could mention what the
practical difference is between:

string s = string.Empty;

and

string s = "";

I am of course, assuming it's generally understood that a definition
without assignment will be null.


Chris.
 
Chris said:
Smithers said:
What's going on is that I'm trying to learn best practices and
guidelines. Juval Lowy has published a document "C# Coding Standard -
Guidelines and Best Practices"

What it amounts to is *a list of rules with no explanations*. I
wouldn't just blindly follow the rules - especially suspicious of any
that start out with "[always | never ]do xyz".

Maybe it's just a pet peeve of mine or something, but I tend to avoid
authors who tell the reader they *should* do something without providing
any explanation as to why.

For curiosity's sake though, maybe someone could mention what the
practical difference is between:

string s = string.Empty;

and

string s = "";

I am of course, assuming it's generally understood that a definition
without assignment will be null.

As strings are interned, a reference to string.Empty would be the same
as a reference to "", so I can't see that there is any difference from
that standpoint. After that, it looks like just coding style to me.
 
Hi,

Tom Porterfield said:
Chris said:
Smithers said:
What's going on is that I'm trying to learn best practices and
guidelines. Juval Lowy has published a document "C# Coding Standard -
Guidelines and Best Practices"

What it amounts to is *a list of rules with no explanations*. I wouldn't
just blindly follow the rules - especially suspicious of any that start
out with "[always | never ]do xyz".

Maybe it's just a pet peeve of mine or something, but I tend to avoid
authors who tell the reader they *should* do something without providing
any explanation as to why.

For curiosity's sake though, maybe someone could mention what the
practical difference is between:

string s = string.Empty;

and

string s = "";

I am of course, assuming it's generally understood that a definition
without assignment will be null.

As strings are interned, a reference to string.Empty would be the same as
a reference to "", so I can't see that there is any difference from that
standpoint. After that, it looks like just coding style to me.

I agree, just a matter of style. Note that this can be made clear by looking
at the code generated for both expressions.
 
Smithers said:
[...]
So, yes - I botched the OP here because the rule talks about
string.Empty
vs.
""
as opposed to string.Empty vs. null (which I put in the OP).

Anyway, I've heard this rule from Jesse Liberty as well. So given that at
least two well-published authors speak of it, I wanted to understand the
reasoning behind it. Thus the OP here.

I discovered recently that in spite of string pooling, the literal ""
and String.Empty are not equivalent. I didn't explore it any further to
see if there was a way I could get "" and string.Empty to be pooled to
the same string, but the default behavior didn't appear to.

Since as far as I know, the string represented by string.Empty will
always exist, you can avoid having a new empty string added to your
string constant pool by always using that instead of "". However, I
doubt there's much difference beyond that. As long as string pooling is
enabled, the worst that using "" instead of string.Empty should cause is:

1) the addition of a single extra string in the string constant pool

2) very slightly slower comparisons between one empty string and
another when they aren't using the same version of the empty string (I'm
assuming here that Equals() for strings checks for reference equality
first, so if the constants are the same reference, this would shortcut
the need to compare length and contents, speeding things very slightly).

In other words, I can't imagine that there's any truly significant
difference between the two. They aren't literally the same, at least in
some situations, but you are unlikely to ever notice the difference in a
real-world application.

Pete
 
You know that those two lines are not the same right? String is a reference
type, that means that in the second line it will be unassigned (referencing
null for lack of a better way of saying).

The variable being unassigned and the variable having an assigned value
of null are very different things.

If you do:

string myString;

for a *member* variable, the variable is assigned the value null. If
you do it for a *local* variable, the variable isn't assigned at all -
its value isn't null, it *has* no value yet, as far as the compiler is
concerned.
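
The difference between a field and a local can be sketched as follows. The class and method names are made up for illustration. The commented-out line is the one the compiler rejects with error CS0165 (use of unassigned local variable).

```csharp
using System;

public class DefiniteAssignmentDemo
{
    private string memberString;            // field: implicitly initialized to null

    public bool MemberIsNull()
    {
        return memberString == null;        // legal: reads the default value
    }

    public static string LocalExample(bool assign)
    {
        string localString;                 // local: has no value at all yet

        // Console.WriteLine(localString);  // would not compile: error CS0165

        if (assign)
            localString = "assigned";
        else
            localString = null;             // null has to be assigned explicitly

        return localString;                 // fine: assigned on every path
    }
}
```

So "string myString;" only means "null" for member variables; for locals the compiler insists on an explicit assignment before the first read.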
 
Tom Porterfield said:
As strings are interned, a reference to string.Empty would be the same
as a reference to "", so I can't see that there is any difference from
that standpoint. After that, it looks like just coding style to me.

Nope, string.Empty and "" won't be the same reference, at least they're
not guaranteed to be. The code below prints False on my box:

using System;

public class Test
{
    static void Main()
    {
        object x = "";
        object y = string.Empty;
        Console.WriteLine(x == y);
    }
}

I wouldn't like to say why, given the IL for the static constructor of
String, but there we go.


Personally I prefer "" to string.Empty - I find it clearer. I've never
seen any good reasons presented for using string.Empty instead.
 
Peter said:
I discovered recently that in spite of string pooling, the literal ""
and String.Empty are not equivalent. I didn't explore it any further to
see if there was a way I could get "" and string.Empty to be pooled to
the same string, but the default behavior didn't appear to.

Hmmm, this appears to be true. The following:

string a = string.Empty;
string b = "";
string c = string.Empty;
string d = "";

Console.WriteLine(object.ReferenceEquals(a, b));
Console.WriteLine(object.ReferenceEquals(a, c));
Console.WriteLine(object.ReferenceEquals(b, d));

Yields the results:

False
True
True
Since as far as I know, the string represented by string.Empty will
always exist, you can avoid having a new empty string added to your
string constant pool by always using that instead of "". However, I
doubt there's much difference beyond that. As long as string pooling is
enabled, the worst that using "" instead of string.Empty should cause is:

1) the addition of a single extra string in the string constant pool

2) very slightly slower comparisons between one empty string and
another when they aren't using the same version of the empty string (I'm
assuming here that Equals() for strings checks for reference equality
first, so if the constants are the same reference, this would shortcut
the need to compare length and contents, speeding things very slightly).

In other words, I can't imagine that there's any truly significant
difference between the two. They aren't literally the same, at least in
some situations, but you are unlikely to ever notice the difference in a
real-world application.

Agreed.
 
Jon said:
Nope, string.Empty and "" won't be the same reference, at least they're
not guaranteed to be. The code below prints False on my box:

using System;

public class Test
{
    static void Main()
    {
        object x = "";
        object y = string.Empty;
        Console.WriteLine(x == y);
    }
}

Yep, did a small test, though slightly different, and confirmed the same
(see reply to Peter Duniho).
I wouldn't like to say why, given the IL for the static constructor of
String, but there we go.

Hehe. I'll poke around in it when I get a chance.
 
Yes, that is a very different question.

I would agree and I always use String.Empty rather than "" in this case. In
theory at least, String.Empty is an existing string while "" is a new one.
And so using String.Empty will result in one less string declaration (even
if it contains no characters).

That said, it's hard to always know what is going on internally within the
compiler and libraries. It is not inconceivable that the compiler could
automatically detect and then replace "" with String.Empty, although I do
not recommend assuming that behavior.

--
Jonathan Wood
SoftCircuits Programming
http://www.softcircuits.com

Smithers said:
What's going on is that I'm trying to learn best practices and guidelines.
Juval Lowy has published a document "C# Coding Standard - Guidelines and
Best Practices"

What it amounts to is a list of rules with no explanations. I wouldn't
just blindly follow the rules - especially suspicious of any that start
out with "[always | never ]do xyz".

His rule #55 states
"use string.Empty rather than """

So, yes - I botched the OP here because the rule talks about
string.Empty
vs.
""
as opposed to string.Empty vs. null (which I put in the OP).

Anyway, I've heard this rule from Jesse Liberty as well. So given that at
least two well-published authors speak of it, I wanted to understand the
reasoning behind it. Thus the OP here.

-S

Tom Porterfield said:
No rule that says you should *always* do something, no matter what the
situation, is ever a good rule. All rules have exceptions.


I have seen, though not often, scenarios where it was important to know
if the field was uninitialized or if it was set to a value of "".
Following the above rule, that distinction would no longer be possible.
 
Jonathan Wood said:
Yes, that is a very different question.

I would agree and I always use String.Empty rather than "" in this case. In
theory at least, String.Empty is an existing string while "" is a new one.

"" is only new the very first time it's used in an assembly.
And so using String.Empty will result in one less string declaration (even
if it contains no characters).

That said, it's hard to always know what is going on internally within the
compiler and libraries. It is not inconceivable that the compiler could
automatically detect and then replace "" with String.Empty, although I do
not recommend assuming that behavior.

You can and *should* assume that string literals are interned, as it's
part of the C# specification. It's not limited to empty strings. If you
have the code:

string x = "hello";
string y = "he"+"llo";
Console.WriteLine(object.ReferenceEquals(x,y));

then the spec guarantees that "True" will be printed out.


Quite why it's not happening for string.Empty vs "" isn't clear at the
moment, but in the general case it's fine.

Now, the performance hit of introducing a single new string *once* is
tiny - so what are the other benefits, if any, of using string.Empty
instead of ""? It looks less clear to me... any of the string.Empty
advocates like to make a case for it?
 
Jon,
string x = "hello";
string y = "he"+"llo";
Console.WriteLine(object.ReferenceEquals(x,y));

then the spec guarantees that "True" will be printed out.

Wow, sure enough. I didn't realize that.

I wonder how much overhead is associated with that feature. For every value
assigned to a String, it sounds like the libraries look up a list of all
existing strings to see if there's a match.
Now, the performance hit of introducing a single new string *once* is
tiny - so what are the other benefits, if any, of using string.Empty
instead of ""? It looks less clear to me... any of the string.Empty
advocates like to make a case for it?

As a long-time C/C++ and assembly language programmer, tiny performance hits
are something I care about for the simple reason that you can end up with
hundreds or even thousands of them in a substantial application. In fact,
one thing that really bothers me about .NET is the extra overhead
imposed by many parts of its design.

That said, I'm not sure there would be any performance hit in this case
given the information you gave above. I can imagine there could be some tiny
bit of additional memory used. Other than that, it sounds like the library
would simply look to see if "" is already a string and, if so, the string
wouldn't be used any further.
 
Jon Skeet said:
Nope, string.Empty and "" won't be the same reference, at least they're
not guaranteed to be. The code below prints False on my box:

using System;

public class Test
{
    static void Main()
    {
        object x = "";
        object y = string.Empty;
        Console.WriteLine(x == y);
    }
}

True but....

string x = "";
string y = string.Empty;
Console.WriteLine (x==y);

will output True: "strings" are interned, "objects" are not...


Willy.
 
Willy Denoyette said:
True but....

string x = "";
string y = string.Empty;
Console.WriteLine (x==y);

will output True: "strings" are interned, "objects" are not...

Indeed - but the point of the exercise was to check for equal
references, not equal objects. I think we can all agree that
string.Empty and "" will always refer to equal strings.
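
The two kinds of comparison being discussed can be separated in a small sketch like the one below. The class and method names are made up for illustration. Expected results are shown only for the value comparisons; the reference comparison is left without one because, as the earlier posts show, it depends on whether the runtime hands out a single instance.

```csharp
using System;

public static class EqualityKindsDemo
{
    // Both operands are typed string, so the overloaded string == operator
    // runs and compares contents.
    public static bool ValueEqual(string a, string b)
    {
        return a == b;
    }

    // Equals is virtual, so contents are compared even through object.
    public static bool VirtualEqual(object x, object y)
    {
        return x.Equals(y);
    }

    // Typed as object, == is a plain reference comparison.
    public static bool ReferenceEqual(object x, object y)
    {
        return x == y;
    }

    public static void Main()
    {
        Console.WriteLine(ValueEqual("", string.Empty));     // True
        Console.WriteLine(VirtualEqual("", string.Empty));   // True
        Console.WriteLine(ReferenceEqual("", string.Empty)); // runtime-dependent
    }
}
```

The compile-time type of the operands, not interning, is what decides which == is used.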
 
Jonathan Wood said:
Wow, sure enough. I didn't realize that.

I wonder how much overhead is associated with that feature. For every value
assigned to a String, it sounds like the libraries look up a list of all
existing strings to see if there's a match.

No - only every string constant, and that only needs to be done once.
The JIT compiler checks whether or not the string has already been
interned when it compiles a ldstr instruction.

In other words, there's very little overhead associated with it.
As a long-time C/C++ and assembly language programmer, tiny performance hits
are something I care about for the simple reason that you can end up with
hundreds or even thousands of them in a substantial application. In fact,
one thing that really bothers me about .NET is the extra overhead
imposed by many parts of its design.

In this case, the overhead is one-off and really, really tiny.

Even in C/C++ and assembly, the old adage about not micro-optimising
until you know where there's a problem holds true.
That said, I'm not sure there would be any performance hit in this case
given the information you gave above. I can imagine there could be some tiny
bit of additional memory used. Other than that, it sounds like the library
would simply look to see if "" is already a string and, if so, the string
wouldn't be used any further.

No - otherwise the examples we've looked at so far wouldn't have shown
"" and string.Empty to be different references.
 
Jon,
No - only every string constant, and that only needs to be done once.
The JIT compiler checks whether or not the string has already been
interned when it compiles a ldstr instruction.

So you're saying it's compile-time overhead. I didn't know "he" + "llo"
was a constant, but that's certainly something that a compiler could
be made to figure out and handle. And, in fact, I see your example returns
False if I change it like this:

string x = "hello";
string y = "he";
y += "llo";
MessageBox.Show(object.ReferenceEquals(x, y).ToString());
Even in C/C++ and assembly, the old adage about not micro-optimising
until you know where there's a problem holds true.

That's an issue for another discussion.
No - otherwise the examples we've looked at so far wouldn't have shown
"" and string.Empty to be different references.

Well, that certainly would explain it, although there could potentially be
other explanations.
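
As a rough sketch of the compile-time versus run-time difference discussed above, the following contrasts a folded constant, a string built at run time, and an explicit call to string.Intern. The class and method names are made up for illustration. Expected output is given only where the result follows from value equality or from the documented behaviour of string.Intern; the other reference comparisons are left open because they depend on the compiler and runtime.

```csharp
using System;

public static class InternDemo
{
    // string.Intern returns the pooled instance for the given contents,
    // so two interned strings with equal contents share one reference.
    public static bool SamePooledInstance(string s, string t)
    {
        return object.ReferenceEquals(string.Intern(s), string.Intern(t));
    }

    public static void Main()
    {
        string literal = "hello";
        string folded = "he" + "llo";   // constant expression, folded by the compiler

        string built = "he";
        built += "llo";                 // concatenated at run time

        Console.WriteLine(literal == built);                          // True (same contents)
        Console.WriteLine(object.ReferenceEquals(literal, folded));
        Console.WriteLine(object.ReferenceEquals(literal, built));
        Console.WriteLine(SamePooledInstance(literal, built));        // True
    }
}
```

Only constants are pooled automatically; a string assembled at run time stays a separate object unless it is passed through string.Intern.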
 
