StringBuilder

Alvin Bruney

Is string builder intelligent enough to handle concats without behaving like
string?

Consider
myStringBuilder.Append("one" + "two")

what does the '+' do here? Because this syntax is also legal but what is the
cost compared to
myStringBuilder.Append("one");
myStringBuilder.Append("two");

and is there a Microsoft recommended pattern for this?
 
Greg Ewing [MVP]

Alvin, your first example, myStringBuilder.Append("one" + "two") will
basically compile to a string concatenation and then the stringbuilder
append.

Overall doing an Append twice will be faster. If you are only doing it once
though it's not a huge difference.
 
Kieran Benton

I'm sure Jon Skeet took a look at the IL for this kind of thing a while ago
and noted that the C# compiler is smart enough to fold the "one" and "two"
constants into "onetwo" at compile time. Of course this won't work if you're
doing something at runtime.

HTH
Kieran
 
Jon Skeet

Kieran Benton said:
I'm sure Jon Skeet took a look at the IL for this kind of thing a while ago
and noted that the C# compiler is smart enough to fold the "one" and "two"
constants into "onetwo" at compile time. Of course this won't work if you're
doing something at runtime.

Indeed - and it's even specified in the C# specification, section
14.15, which governs what counts as a constant expression and then
states:

<quote>
Whenever an expression is of one of the types listed above and contains
only the constructs listed above, the expression is evaluated at
compile-time.
</quote>
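
A minimal sketch of what that rule means for the original question, assuming a plain console program; the class name ConstantFoldingDemo is only illustrative. Because both operands are literals, the sum is a constant expression, which is why it can even be assigned to a const.

```csharp
using System;
using System.Text;

public static class ConstantFoldingDemo
{
    // Legal only because "one" + "two" is a constant expression,
    // so the compiler evaluates it at compile time.
    public const string Folded = "one" + "two";

    public static string Build()
    {
        StringBuilder sb = new StringBuilder();
        // The argument is a constant expression, so a single Append
        // of the folded constant is all that happens at run time.
        sb.Append("one" + "two");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Folded);
        Console.WriteLine(Build());
    }
}
```

This only covers literals and other constants; as soon as one operand is a variable, the concatenation happens at run time.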
 
TJoker .NET [MVP]

My two cents.
For a small number of string operations, the StringBuilder may even be slower
than regular string concatenation. I don't know of any rule of thumb when
deciding which approach to use, but usually I resort to a StringBuilder when
I see concatenations inside a for/while loop or if massive text replacement
is being done in a long text.
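
A rough sketch of the loop case described above, not a benchmark. The class and method names are made up for illustration; both methods produce the same text, but the first one copies the whole string on every pass.

```csharp
using System;
using System.Text;

public static class LoopConcatDemo
{
    // Each += allocates a new string and copies everything built so far.
    public static string WithConcat(int count)
    {
        string text = "";
        for (int i = 0; i < count; i++)
        {
            text += i + ";";
        }
        return text;
    }

    // The builder appends into its own buffer; one string is created at the end.
    public static string WithBuilder(int count)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(';');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WithConcat(5) == WithBuilder(5));
    }
}
```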


--
TJoker, MCSD.NET
MVP: Paint, Notepad, Solitaire

****************************************
 
Alvin Bruney

well for starters, i got this kind of construct peppered all over my code. i
start off with good intentions with the append but there seems to always be
a piece of string needing to be added and not worth starting a new line.
i know that is so lazy. always figured the language was smart enough to know
better.

i didn't care to read spec 14.15. what does it say in the queens english? do
i need to stop doing that (is really what i am asking)
 
Michael Mayer

But be sure to think code readability, as well. For example, this is a lot
more readable:

string name = lastname + ", " + firstname;

than the following

StringBuilder nameBuilder = new StringBuilder();
nameBuilder.Append (lastname);
nameBuilder.Append (", ");
nameBuilder.Append (firstname);
string name = nameBuilder.ToString();

Not to mention that the first is quicker to write and less error prone, and
quite possibly faster in execution.

I would even suggest that if you're concatenating 100,000 names, still only
use the string builder to put all the names together, something like:

StringBuilder nameBuilder = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    nameBuilder.Append (row[lastname] + ", " + row[firstname] +
        Environment.NewLine);
}

instead of:

StringBuilder nameBuilder = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    nameBuilder.Append (row[lastname]);
    nameBuilder.Append (", ");
    nameBuilder.Append (row[firstname]);
    nameBuilder.Append (Environment.NewLine);
}


I think this simplistic case would be fairly readable written either way,
but the first example keeps the name grouped together nicely, so if you add
an address, and other stuff, you don't need to split it all out with
comments:

//Add name
four
lines
to build
name

// Add Address
eight
lines of
code to
assemble
an address
from street, city state, etc

and on.

--
Mike Mayer
http://www.mag37.com/csharp/
(e-mail address removed)
 
Jay B. Harlow [MVP - Outlook]

Michael,
Just remember, though, that with:

StringBuilder nameBuilder = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    nameBuilder.Append (row[lastname] + ", " + row[firstname] +
        Environment.NewLine);
}

for 100,000 names you have just created 100,000 string objects that will need
to be garbage collected.

Whereas:

StringBuilder nameBuilder = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    nameBuilder.Append (row[lastname]);
    nameBuilder.Append (", ");
    nameBuilder.Append (row[firstname]);
    nameBuilder.Append (Environment.NewLine);
}

will not create 100,000 string objects.

In both cases you should set the initial capacity in the StringBuilder
constructor, as the internal buffer is going to be reallocated each time it
runs out of space. The buffer starts at 16 characters if no value is given.
Normally you need to make an approximate guess at how much capacity you will
need, for example 100,000 names * (25 first name + 25 last name + 1 comma +
1 newline) = 5,200,000.
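
The estimate above could be turned into code roughly like this. It is a sketch under the same assumptions as the arithmetic in the post (25 characters per name part, one comma, one newline); CapacityDemo and EstimateCapacity are hypothetical names.

```csharp
using System;
using System.Text;

public static class CapacityDemo
{
    // Estimate from the post: 25 + 25 characters of name, 1 comma, 1 newline.
    public static int EstimateCapacity(int rowCount)
    {
        const int charsPerRow = 25 + 25 + 1 + 1;
        return rowCount * charsPerRow;
    }

    public static void Main()
    {
        int estimate = EstimateCapacity(100000);

        // Passing the estimate to the constructor reserves the space up front.
        StringBuilder sized = new StringBuilder(estimate);

        Console.WriteLine(estimate);
        Console.WriteLine(sized.Capacity >= estimate);
    }
}
```

If the separator is really ", " and the newline is two characters, the per-row figure should be adjusted accordingly; the builder still grows on its own if the guess is too low.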

I normally avoid calling String.Concat (C# + or VB.NET & on strings) when
using a StringBuilder as part of the reason I use the StringBuilder is to
avoid any intermediate string objects on heap. Using String.Concat with
StringBuilder is just creating those objects...

Just a thought
Jay

 
Michael Mayer

Jay B. Harlow said:
Michael,
Just remember, though, that with:

StringBuilder nameBuilder = new StringBuilder();
foreach (DataRow row in myDataTable.Rows)
{
    nameBuilder.Append (row[lastname] + ", " + row[firstname] +
        Environment.NewLine);
}

for 100,000 names you have just created 100,000 string objects that will need
to be garbage collected.

Ok, I agree (and I guess it's worse since there's three concats in the above
so it would probably result in 300,000 temporary strings). I haven't really
ever tried to do the above for such a big number of strings - I might make
an example to see how bad it is. Most of the data I've used would only try
to concat a few dozens of records at a time, where you can get by with
runtime inefficiencies.

The main thing I try to avoid is copying of the entire string over and over
(as you'd get if you did text += newStuff a whole bunch of times). But there
certainly are advantages to avoiding the short strings when there will be a
lot of them.

Jay B. Harlow said:
I normally avoid calling String.Concat (C# + or VB.NET & on strings) when
using a StringBuilder as part of the reason I use the StringBuilder is to
avoid any intermediate string objects on heap. Using String.Concat with
StringBuilder is just creating those objects...

I guess I would have to agree, in retrospect, since a few hundred thousand
extra strings on the heap isn't optimal - when they can easily be avoided
altogether.
 
Jon Skeet

Alvin Bruney said:
well for starters, i got this kind of construct peppered all over my code. i
start off with good intentions with the append but there seems to always be
a piece of string needing to be added and not worth the starting a new line.
i know that is so lazy. always figured the language was smart enough to know
better.

Not sure exactly which construct you mean here, to be honest.

Alvin Bruney said:
i didn't care to read spec 14.15. what does it say in the queens english? do
i need to stop doing that (is really what i am asking)

It says that "literal1"+"literal2" is always evaluated at compile-time
to "literal1literal2".
 
Jon Skeet

Michael Mayer said:
Ok, I agree (and I guess it's worse since there's three concats in the above
so it would probably result in 300,000 temporary strings).

Nope, there's only 100,000, actually, although there are also 100,000
string arrays created briefly. The line of:

String x = a + ',' + b + 'x';

is compiled to String.Concat (new object[] {a, ',', b, 'x'});
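
To see that the two forms give the same string, here is a small sketch; ConcatDemo and its method names are illustrative. The exact code a compiler emits for the + form depends on the compiler version, so the explicit call is only a stand-in for the lowering described above.

```csharp
using System;

public static class ConcatDemo
{
    // The form as written in source.
    public static string WithOperator(string a, string b)
    {
        return a + ',' + b + 'x';
    }

    // The explicit call described in the post: one temporary array,
    // holding boxed chars alongside the strings, and one result string.
    public static string WithExplicitConcat(string a, string b)
    {
        return String.Concat(new object[] { a, ',', b, 'x' });
    }

    public static void Main()
    {
        Console.WriteLine(WithOperator("Skeet", "Jon"));
        Console.WriteLine(WithExplicitConcat("Skeet", "Jon"));
    }
}
```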
 
Jay B. Harlow [MVP - Outlook]

Michael,
Michael Mayer said:
Ok, I agree (and I guess it's worse since there's three concats in the above
so it would probably result in 300,000 temporary strings).

As Jon said, there are only 100,000 temporary strings, plus 100,000 temporary
string arrays.

Interesting: String.Concat is overloaded for a variable number of strings and
a variable number of objects. When overloading for a variable number of
parameters (params) it's common to also have overloads for a fixed number such
as 1, 2 and 3, as this avoids the temporary array to hold the parameters in
these cases.

String.Concat can concatenate up to 4 strings before it needs to use the
params version and create the array. However, it can only concatenate up to 3
objects before it needs to use the params version. Interesting, I would
expect both to be 3 or both 4. Hmm...
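
A small sketch of the overload behaviour described above, with an illustrative class name. Which overload the compiler binds to is an assumption based on the framework versions discussed in this thread; newer frameworks have added further overloads.

```csharp
using System;

public static class ConcatOverloadDemo
{
    // Four strings: a fixed-arity overload exists, so no argument array is needed.
    public static string FourStrings()
    {
        return String.Concat("a", "b", "c", "d");
    }

    // Five strings: no fixed-arity overload, so a params overload is used
    // and the arguments are packed into a temporary collection first.
    public static string FiveStrings()
    {
        return String.Concat("a", "b", "c", "d", "e");
    }

    public static void Main()
    {
        Console.WriteLine(FourStrings());
        Console.WriteLine(FiveStrings());
    }
}
```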

Hope this helps
Jay

 
Alvin Bruney

Jon Skeet said:
Not sure exactly which construct you mean here, to be honest.

was talking about .append(blah + moreblah) as opposed to .append(blah);
.append(moreblah)

Jon Skeet said:
It says that "literal1"+"literal2" is always evaluated at compile-time
to "literal1literal2".

ok well that seems to be good for me then since it evaluates it at compile
time. but what happens at run-time? what if literal2 was a string object
(not interned) - as would frequently be the case. does it know what to do
then?

consider StringBuilder1.Append("my name is" + varName). Knowing this answer
is fundamentally important to me since I am building about 10,000 queries
that way and performance must always supersede anything else.
 
Jay B. Harlow [MVP - Outlook]

Alvin,
See my & Jon's responses to Michael Mayer.

Alvin Bruney said:
consider StringBuilder1.Append("my name is" + varName). Knowing this answer
is fundamentally important to me since I am building about 10,000 queries
that way and performance must always supersede anything else.

You will have 10,000 temporary string objects on the heap that will need to
be collected. If you are concatenating more than 3 objects or 4 strings you
will have that many temporary object arrays on the heap, in addition to the
temporary string objects.

Hope this helps
Jay

 
mikeb

Alvin said:
was talking about .append(blah + moreblah) as opposed to .append(blah);
.append(moreblah)

ok well that seems to be good for me then since it evaluates it at compile
time. but what happens at run-time? what if literal2 was a string object
(not interned) - as would frequently be the case. does it know what to do
then?

consider StringBuilder1.Append("my name is" + varName). Knowing this answer
is fundamentally important to me since I am building about 10,000 queries
that way and performance must always supersede anything else.

If the transient string objects that are created in calls like this are
an issue, there are a couple of ways to deal with the problem, while
keeping readability:

1) remember that StringBuilder.Append() returns the StringBuilder
instance. So you can make the above call like so:

StringBuilder1.Append( "my name is ").Append( varName);

whether this remains readable or not is a question of taste. It
does not create the transient string object, however.

2) write your own utility method that takes multiple strings to
append to a StringBuilder instance. Since StringBuilder is sealed, this
would likely be a static class - something like (untested code follows):

using System.Text;

public static class SbUtil {
    public static StringBuilder Append( StringBuilder sb,
                                        string s1,
                                        string s2)
    {
        sb.Append( s1);
        sb.Append( s2);
        return sb;
    }
}

// ...

SbUtil.Append( StringBuilder1, "my name is ", varName);
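
One possible variation on the second option, sketched here as an assumption rather than anything from the thread: a helper that accepts any number of strings. SbParamsUtil and AppendAll are hypothetical names. Note that a params call creates a temporary string array on each call, so it trades the temporary string for a temporary array.

```csharp
using System;
using System.Text;

public static class SbParamsUtil
{
    // Hypothetical helper: appends every part in order and returns the
    // builder so that calls can be chained.
    public static StringBuilder AppendAll(StringBuilder sb, params string[] parts)
    {
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb;
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder();
        string varName = "Alvin";
        AppendAll(sb, "my name is ", varName, ".");
        Console.WriteLine(sb.ToString());
    }
}
```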

 
mikeb

You might also want to test StringBuilder.AppendFormat() - it may be
smart enough to avoid the transient string object creation (probably
not, though).
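
For reference, the AppendFormat version of the call would look something like the sketch below (AppendFormatDemo is an illustrative name). Whether it avoids intermediate allocations depends on the framework's implementation, so it is worth measuring rather than assuming.

```csharp
using System;
using System.Text;

public static class AppendFormatDemo
{
    public static string Build(string varName)
    {
        StringBuilder sb = new StringBuilder();
        // {0} is replaced by the first argument after the format string.
        sb.AppendFormat("my name is {0}", varName);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build("Alvin"));
    }
}
```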

 
Alvin Bruney

i like this append().append format.

i'll give it a shot - it appears immediately readable to me by the way so
readability won't be an issue.

ordinarily, i don't suspect this sort of thing would be an everyday issue but
i am building a mountain of strings so i thought i should take a second look
at memory allocation/deallocation.

thanks for the vigilance all.

 
