String and StringBuilder

Hardy Wang

Hi all,
I know it is better to handle large strings with a StringBuilder, but how
does the StringBuilder class improve performance behind the scenes?

Thanks!
 
Nicholas Paldino [.NET/C# MVP]

Hardy,

It works by preallocating a buffer which is added to. Instead of
allocating memory for the new string on concatenations, it has a buffer
pre-allocated which is written to. When the buffer is full, it then
reallocates the buffer to be twice as big, and copies the old buffer to the
new one.
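
The doubling strategy described above can be sketched with a toy class. This is only an illustration: DoublingBuffer is a hypothetical name, the starting capacity of 16 is an assumption, and the real StringBuilder internals differ between .NET versions.

```csharp
using System;

// Hypothetical toy class that mimics the "preallocate, then double" idea.
// It is NOT the real StringBuilder implementation.
public sealed class DoublingBuffer
{
    private char[] _buffer;
    private int _length;

    public DoublingBuffer(int capacity = 16)
    {
        _buffer = new char[Math.Max(1, capacity)];
    }

    public int Capacity => _buffer.Length;
    public int Length => _length;

    public DoublingBuffer Append(string value)
    {
        if (string.IsNullOrEmpty(value)) return this;

        int required = _length + value.Length;
        if (required > _buffer.Length)
        {
            // Buffer is full: double until the new text fits,
            // then copy the old contents into the bigger array once.
            int newCapacity = _buffer.Length;
            while (newCapacity < required) newCapacity *= 2;
            Array.Resize(ref _buffer, newCapacity);
        }

        // Only the appended characters are written; existing ones stay put.
        value.CopyTo(0, _buffer, _length, value.Length);
        _length = required;
        return this;
    }

    public override string ToString() => new string(_buffer, 0, _length);
}
```

With a starting capacity of 4, appending "hello" grows the buffer to 8, and appending "hello" again grows it to 16; appends that fit in the current buffer allocate nothing.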

Hope this helps.
 
Hardy Wang

Thanks, it makes sense to me!

 
Greg Young

In the background? Not sure I am understanding you here.

The reason StringBuilder generally performs better is that strings are
immutable.

string b = "";
for (int i = 0; i < 100; i++) {
    b += "hello";
}
return b;

vs

StringBuilder b = new StringBuilder(InitialSize);
for (int i = 0; i < 100; i++) {
    b.Append("hello");
}
return b.ToString();

Each operation in the first loop creates a new string, because strings are
immutable (they cannot be changed, at least not in safe code). The
StringBuilder keeps using the same buffer. Since it reuses the same object
over and over (altering an internal buffer), it uses far less memory than
the string example. The major speed gain comes from garbage collection and
from the copying of memory.

Creating objects in .NET is very fast; it is getting rid of them that is
slow. Since the first example creates so many intermediate objects, the
garbage collector has a lot of work to do to get rid of them.

As for copying memory, the string example copies the data for the entire
string every time a new string is created (for small data sets this is
rather minor, but for big strings it gets quite expensive). The
StringBuilder class will also copy data if it has to expand its buffer, but
otherwise it only has to copy the data that is actually being appended (as
opposed to all of the data).
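
The copying argument can be put into rough numbers. The sketch below is a simplified model, not a measurement: CopyCost and its method names are hypothetical, and it assumes every piece has the same length and that the builder's buffer is already large enough, so it never has to grow.

```csharp
using System;

// Hypothetical helper that counts characters copied under a simplified model.
public static class CopyCost
{
    // Repeated += : step i builds a brand-new string of i * pieceLength
    // characters, so the whole result so far is copied every time.
    public static long RepeatedConcat(int pieces, int pieceLength)
    {
        long total = 0;
        for (int i = 1; i <= pieces; i++)
        {
            total += (long)i * pieceLength;
        }
        return total;
    }

    // Presized builder: each piece is copied into the buffer once,
    // plus one final copy of everything when ToString() is called.
    public static long PresizedBuilder(int pieces, int pieceLength)
    {
        long appended = (long)pieces * pieceLength;
        return appended + appended;
    }
}
```

For the loop above (100 appends of "hello"), CopyCost.RepeatedConcat(100, 5) gives 25250 characters copied, against 1000 for CopyCost.PresizedBuilder(100, 5). The first figure grows quadratically with the number of appends, the second only linearly.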

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung
 
Michael Nemtsev

Hello Hardy,

Just to add to Nicholas's reply: as far as I know, the default buffer
allocated by StringBuilder holds 16 characters, and once that is used up,
StringBuilder doubles its size.
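
One way to check this on a given runtime is to watch the Capacity property while appending. The sketch below uses a hypothetical helper named CapacityProbe. The documented default capacity is 16, but the growth steps after that are an implementation detail and may not be exact doubling on every .NET version.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper that records each distinct capacity a default
// StringBuilder reports while characters are appended one at a time.
public static class CapacityProbe
{
    public static int[] ObserveGrowth(int appends)
    {
        var sb = new StringBuilder();            // default capacity
        var seen = new List<int> { sb.Capacity };

        for (int i = 0; i < appends; i++)
        {
            sb.Append('x');
            // Record the capacity only when it changes.
            if (sb.Capacity != seen[seen.Count - 1])
            {
                seen.Add(sb.Capacity);
            }
        }
        return seen.ToArray();
    }
}
```

Printing string.Join(", ", CapacityProbe.ObserveGrowth(100)) shows the starting capacity followed by each growth step on the runtime in use.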

---
WBR,
Michael Nemtsev :: blog: http://spaces.msn.com/laflour

"At times one remains faithful to a cause only because its opponents do not
cease to be insipid." (c) Friedrich Nietzsche
 
Ignacio Machin ( .NET/ C# MVP )

Hi,

That is not true, StringBuilder is intended to be used for constructing a
string, usually by concatenation ( Append ).
String is a "special" class: it is a heavily used data type that is immutable.
This means it cannot be changed; once it is created, any modification will
create a new instance (and possibly discard the old one). As a result you
may get a lot of instances in certain circumstances, such as in a loop:

string s = "";
foreach (DataRow dr in dataset.Tables[0].Rows)
    s = s + dr["column1"].ToString();

In the above code, each time s is appended to, a new instance is created with
the needed size: the old value is copied first, then the second string is
copied, then the result is assigned and the old instance becomes eligible for
garbage collection. If you add up the space needed, you will see that at some
moment you are using twice the memory needed: one half in the old instance,
the other in the new instance. Hope you understand what is happening.

OTOH, with the code:

StringBuilder sb = new StringBuilder();
foreach (DataRow dr in dataset.Tables[0].Rows)
    sb.Append(dr["column1"].ToString());

there is no need to copy over the previous value; it just "adds" the new
string. The next question is how this add is done: it either preallocates a
buffer and reallocates a bigger one when it is full, or it keeps a linked
list of buffers and adds another when one is filled.
 
Harvey Triana

Greg, a question...
When the number of loop iterations is low, say

for (int i = 0; i < 8; i++) {
    b += "hello";
}

should I still use the StringBuilder class?

<HT />
 
Nicholas Paldino [.NET/C# MVP]

Harvey,

In this SPECIFIC case, no, since you know you have eight iterations of
the loop and you know the string you will be appending to the end result.

However, if you didn't know that "hello" was the string, and you didn't
know how many times through the loop you were iterating, then yes, a
StringBuilder would be best.
 
Ignacio Machin ( .NET/ C# MVP )

Hi,


Do you mean whether you should use it when you have only a few iterations?

It's debatable; if you look in the archives you will see several threads
about this. The results depend on the strings, the initial buffer and the
number of iterations.

IMO you should always use StringBuilder in a loop.
 
Michael Nemtsev

Hello Harvey,

I posted a sample about two months ago comparing SB and string concatenation.
The difference starts to become visible at about 25,000 concatenated strings;
below that number the difference is insignificant (everything depends on your
environment).

 
Greg Young

I would like to see this sample showing the difference is insignificant
until 25000 strings as this is in fact not the case ...

public static void TenThousandWithString() {
    string s = "";
    for (int i = 0; i < 10000; i++) {
        s += i.ToString();
    }
    DummyString = s;
}

public static void TenThousandWithStringBuilder() {
    StringBuilder s = new StringBuilder(50000);
    for (int i = 0; i < 10000; i++) {
        s.Append(i.ToString());
    }
    DummyString = s.ToString();
}

For 1000 runs:

Test : String took 190539874570.188 ns, average ns = 190539874.570188
Test : StringBuilder took 3698780828.64676 ns, average ns = 3698780.82864676

I would say that being roughly 50 times faster is fairly noticeable, wouldn't
you?

At 1000 concats it's still 5 times faster, and these are small strings. With
bigger strings the speed difference increases.
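
The timing harness behind those numbers is not shown in the post. A minimal sketch of how such a comparison might be set up with Stopwatch follows; ConcatTiming and its method names are hypothetical, and the results will vary with machine, runtime and build settings.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical timing harness; not the one used for the figures above.
public static class ConcatTiming
{
    public static string WithString(int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
        {
            s += i.ToString();   // allocates a new string on every pass
        }
        return s;
    }

    public static string WithStringBuilder(int count)
    {
        // Rough size guess so the buffer rarely has to grow (an assumption).
        var sb = new StringBuilder(count * 5);
        for (int i = 0; i < count; i++)
        {
            sb.Append(i.ToString());
        }
        return sb.ToString();
    }

    // Average milliseconds per call of `work` over `runs` repetitions.
    public static double AverageMs(Func<int, string> work, int count, int runs)
    {
        work(count);             // warm-up so JIT compilation is not timed
        var watch = Stopwatch.StartNew();
        for (int r = 0; r < runs; r++)
        {
            work(count);
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / runs;
    }
}
```

Calling ConcatTiming.AverageMs(ConcatTiming.WithString, 10000, 1000) and the same with ConcatTiming.WithStringBuilder gives two averages to compare. Repeating the work many times, as the post does, lets garbage collection costs show up in the total.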

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung
 
Greg Young

There is a break-even point below which string concatenation may actually be
slightly faster than the StringBuilder. These are generally very small
numbers, like 6, and apply when comparing against a StringBuilder without an
initial size, since it has to grow its internal buffer, which is almost
exactly what happens in the string case.

Generally I tell people to stay away from this code unless they are _really_
optimizing an area, as requirements will often change slightly and the
StringBuilder will become faster.

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung
 
Michael Nemtsev

Hello Greg,

It is here: http://groups.google.com/group/micr.../f94bf27c8cbc912e?hl=en&#doc_6d8fc6a2b950ddad

but with a somewhat larger number of strings.

I'm not claiming that using SB doesn't affect performance. It really does,
but we need to understand where that effect matters.
We can concatenate 10,000 strings with SB in 0.09 sec and that does not hurt
us in one app, and we can use SB to concatenate 6 strings where performance
will be hurt severely.
Everything depends on your context.
I'm for simplicity when we are not fighting for an extra couple of seconds.

 
Greg Young

I guess you don't find it important until it takes a full second as opposed
to < 100 ms? This is already at least 10 times faster and can make a HUGE
difference in the running time of your app, as this is a CPU-bound task (it
uses 100% of a CPU while doing this, which means that if you do this a lot
your CPU usage will needlessly go up).

Your code also only runs it once, which completely ignores the GC penalty
incurred by the string approach. My guess is that this would make it closer
to 15-20 times slower.

Michael Nemtsev said:
and use SB to concat 6 string where performance will be hurted severe.

The penalty for using a SB is NEVER severe if used properly (unless you are
concatenating, say, 2 strings). The penalty is only severe for using strings,
which is why it is best to err on the side of caution and use a
StringBuilder.

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung
 
Harvey Triana

Oh thanks, this is a good discussion. Let me suggest another question:

When I write:
return IntegerPart + " " + MoneyName + " " + RealPart;

- how many strings are created before the result is returned as a single string?

Should I use StringBuilder in this case?

<HT />
 
Jon Skeet [C# MVP]

Michael Nemtsev said:
I posted a sample about two months ago comparing SB and string
concatenation. The difference starts to become visible at about 25,000
concatenated strings; below that number the difference is insignificant
(everything depends on your environment).

That depends on what you mean by "significant". The difference between
using StringBuilder and string becomes detectable over the course of
several iterations of the whole operation *much* earlier than that.
StringBuilder becomes orders of magnitude faster than string
concatenation at relatively small numbers of concatenations.

While it's true that the overall operation is still fast, if you're
doing enough concatenation (even in smallish batches) StringBuilder is
still much faster.
 
Jon Skeet [C# MVP]

Harvey Triana said:
When I write:
return IntegerPart + " " + MoneyName + " " + RealPart;

- how many strings are created before the result is returned as a single string?

No more than would be created using StringBuilder, and no StringBuilder
instance is required.

Harvey Triana said:
Should I use StringBuilder in this case?

Absolutely not.

See http://www.pobox.com/~skeet/csharp/stringbuilder.html
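
As a rough illustration of why no StringBuilder is needed here: a chain of + operators in a single expression is typically compiled into one String.Concat call, which sizes the result once and copies each piece into it. In the sketch below, MoneyText and its method names are hypothetical, and the three parts are assumed to be strings.

```csharp
using System;

// Hypothetical helper; parameter names follow the question and their
// types are assumed to be string.
public static class MoneyText
{
    // The expression from the question, written with + throughout.
    public static string WithPlus(string integerPart, string moneyName, string realPart)
    {
        return integerPart + " " + moneyName + " " + realPart;
    }

    // Roughly what the + chain amounts to: one call, one result string,
    // no intermediate strings for the partial concatenations.
    public static string WithConcat(string integerPart, string moneyName, string realPart)
    {
        return string.Concat(integerPart, " ", moneyName, " ", realPart);
    }
}
```

Both methods return the same text, for example MoneyText.WithPlus("12", "dollars", "50") yields "12 dollars 50". The situation changes only when the concatenation is spread across loop iterations, where each pass produces a new intermediate string.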
 
Greg Young

The answer here is (n - 1), so 4. Since these strings are all short, it's
not too big a deal (the copies are of, say, 8 bytes as opposed to 800 or
8000) :)

I think in a case like this you are fine to just use strings. It is when it
gets much bigger that you run into problems (i.e. if you see a loop, you have
a problem :)).

If this is a place where micro-optimization is very important to you, just
remember to measure, measure, measure, and when you are done, measure some
more.

Cheers,

Greg Young
MVP - C#
http://codebetter.com/blogs/gregyoung
 
Greg Young

Jon,

Just to be formal ...

There may be more objects created than with the StringBuilder; it really
depends on what the variables come out as and on the initial size of the
StringBuilder object (although I agree with you that the StringBuilder is
bad code here).

Cheers,

Greg
 
