Performance of StringBuilder

Franz

// Content is an ArrayList which contains a large amount of strings
string strContent = String.Empty;
foreach (string strLine in Content)
    strContent += strLine;

StringBuilder aStrBuilder = new StringBuilder();
foreach (string strLine in Content)
    aStrBuilder.Append(strLine);
strContent = aStrBuilder.ToString();

Which method is faster?
Is there any article which talks about the performance of StringBuilder in
depth?
Thanks.
 
Peter Strøiman

For a large amount of strings the StringBuilder is by far the fastest.

The reason is that the string class only allocates enough memory to hold the
string. Thus every time you append a string, new memory has to be allocated
and the entire content has to be copied.
The string builder allocates a big chunk of memory at once so it is prepared
to append new strings.

/Pete
 
Dmitriy Lapshin [C# / .NET MVP]

Hi,

The StringBuilder approach is faster and it is the recommended approach for
constructing strings dynamically. What's wrong with concatenating strings
with the += operator is that the "string" type is immutable. In other words,
every time the following line of code is executed:

strContent += strLine

a new string instance is created and the old "strContent" instance is
discarded. This leads to memory re-allocations, hence performance penalty.

If you can predict the length of the resultant string, construct a
StringBuilder instance by specifying the expected length as a constructor
argument. Thus you will avoid unnecessary memory re-allocations that could
happen otherwise, as the StringBuilder needs to grow its internal
buffer to store the result.
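
As a minimal sketch of the constructor argument mentioned above: the class name, method name, and sample strings here are made up for illustration, and the capacity is only a hint, since the builder still grows if the estimate is too small.

```csharp
using System;
using System.Text;

static class PresizeSketch
{
    // Joins the parts with a StringBuilder whose initial capacity is
    // supplied by the caller (an estimate of the final length).
    public static string Join(string[] parts, int expectedLength)
    {
        StringBuilder sb = new StringBuilder(expectedLength);
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    static void Main()
    {
        string[] parts = { "alpha", "beta", "gamma" };
        // 5 + 4 + 5 = 14 characters in total
        Console.WriteLine(Join(parts, 14)); // prints "alphabetagamma"
    }
}
```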
 
Franz

Thanks.
I understand now.

 
Tommy Carlier

If you want higher performance, you shouldn't use foreach: foreach is
really slow. The fastest possible code I can think of that does what
you want is:

StringBuilder aStrBuilder = new StringBuilder();
int ContentCount = Content.Count;
for (int i = 0; i < ContentCount; i++)
    aStrBuilder.Append(Content[i]);
strContent = aStrBuilder.ToString();
 
Jon Skeet [C# MVP]

Tommy Carlier said:
If you want higher performance, you shouldn't use foreach: foreach is
really slow.

No it's not - where did you get that impression? In 1.0 foreach was
slow when it was iterating over the characters in a string, but that's
much better in 1.1. For *some* types using foreach is slower than the
code you showed, but it's far from universally true.

What *might* make a difference, however, would be getting the
StringBuilder size right to start with. It would certainly be worth
*trying* iterating over the loop and finding the total length before
constructing the StringBuilder, then iterating again to append the
actual values. I don't know whether or not it would *actually* be
faster though - in cases with strings which got increasingly large, it
quite possibly would be.
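
The two-pass idea described above might look roughly like this. It is a sketch only: the class and method names are hypothetical, it assumes every element of the list is a non-null string, and whether it is actually faster would still need measuring.

```csharp
using System;
using System.Collections;
using System.Text;

static class TwoPassSketch
{
    // First pass: add up the lengths. Second pass: append into a
    // StringBuilder that was created with exactly that capacity.
    public static string Concat(ArrayList content)
    {
        int totalLength = 0;
        foreach (string s in content)
        {
            totalLength += s.Length;
        }

        StringBuilder sb = new StringBuilder(totalLength);
        foreach (string s in content)
        {
            sb.Append(s);
        }
        return sb.ToString();
    }

    static void Main()
    {
        ArrayList content = new ArrayList();
        content.Add("one ");
        content.Add("two ");
        content.Add("three");
        Console.WriteLine(Concat(content)); // prints "one two three"
    }
}
```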
 
Dmitriy Lapshin [C# / .NET MVP]

Jon,

The belief that "foreach" is slower than "for" is, I suppose, based on the
following: "foreach" utilizes IEnumerator interface, while "for" indexes a
collection directly. Since "foreach" requires additional work such as
obtaining a reference to the collection's IEnumerator and calling MoveNext
on the reference, it might indeed be slower than direct indexer access
followed by a simple integer variable increment.

--
Dmitriy Lapshin [C# / .NET MVP]
X-Unity Test Studio
http://x-unity.miik.com.ua/teststudio.aspx
Bring the power of unit testing to VS .NET IDE
 
Jon Skeet [C# MVP]

Dmitriy Lapshin said:
The belief that "foreach" is slower than "for" is, I suppose, based on the
following: "foreach" utilizes IEnumerator interface, while "for" indexes a
collection directly. Since "foreach" requires additional work such as
obtaining a reference to the collection's IEnumerator and calling MoveNext
on the reference, it might indeed be slower than direct indexer access
followed by a simple integer variable increment.

By the time everything's been inlined by the JIT compiler, there's
unlikely to be any significant difference, at least for strings and
arrays. The idea that foreach is "really slow" is certainly inaccurate
- I'd want to see some actual *evidence* of it to start with!
 
Alvin Bruney

Yes it is significantly slower.

There is inherent overhead in the foreach:
the foreach unwraps into a finally construct at compile time which
must be executed upon loop completion. That call involves a stack allocation
and a dispose call.
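
For reference, a foreach over a non-generic collection such as ArrayList behaves roughly like the hand-written loop below. This is a sketch of the general pattern rather than the exact code the compiler emits, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections;

static class ForeachExpansionSketch
{
    // The loop as it is normally written.
    public static int SumWithForeach(ArrayList list)
    {
        int sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }

    // Roughly what that loop does: get an enumerator, call MoveNext
    // and Current per element, and dispose the enumerator in a
    // finally block if it happens to implement IDisposable.
    public static int SumExpanded(ArrayList list)
    {
        int sum = 0;
        IEnumerator enumerator = list.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int value = (int)enumerator.Current; // unboxing cast
                sum += value;
            }
        }
        finally
        {
            IDisposable disposable = enumerator as IDisposable;
            if (disposable != null)
            {
                disposable.Dispose();
            }
        }
        return sum;
    }

    static void Main()
    {
        ArrayList list = new ArrayList();
        for (int i = 1; i <= 3; i++) list.Add(i);
        Console.WriteLine(SumWithForeach(list)); // prints 6
        Console.WriteLine(SumExpanded(list));    // prints 6
    }
}
```

The finally block runs once per loop, while MoveNext and Current run once per element.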

A for loop is a basic indexer access, O(1). It does not have a dispose call
or the overhead of a method access for the IEnumerable item. You can argue,
and I will concede, that the optimization is only advantageous for
array-like data structures, since normal arrays are optimized in the IL
anyway, but it does count for a solid couple of milliseconds, or a whopping
55% deficiency on my system.

In this demo, my screen shows a 55% difference.

Disclaimer: I've got a monster machine so I don't expect your numbers to
even be in the same ballpark but the difference should be the same.

FOR 0.04489313
FOREACH 0.09888659

using System;

namespace Loops
{
    class Tester
    {
        [STAThread]
        static void Main(string[] args)
        {
            int count = 2000000;

            // creation and initialization of the list
            System.Collections.ArrayList list = new System.Collections.ArrayList(count);
            for (int i = 0; i < count; i++) list.Add(i);

            int tempVar = 0;

            Console.Write("FOR ");
            Start();
            for (int i = 0; i < list.Count; i++)
            {
                // do something with list[i] (the cast unboxes the stored int)
                tempVar = (int) list[i];
            }
            Console.WriteLine(Stop());

            Console.Write("FOREACH ");
            Start();
            foreach (int i in list)
            {
                // do something with i
                tempVar = i;
            }
            Console.WriteLine(Stop());

            Console.ReadLine();
        }

        // high-resolution Win32 timer functions
        [System.Runtime.InteropServices.DllImport("KERNEL32")]
        private static extern bool QueryPerformanceCounter(ref long lpPerformanceCount);
        [System.Runtime.InteropServices.DllImport("KERNEL32")]
        private static extern bool QueryPerformanceFrequency(ref long lpFrequency);

        private static long startCount = 0;

        // records the counter value at the start of a measurement
        private static void Start()
        {
            startCount = 0;
            QueryPerformanceCounter(ref startCount);
        }

        // returns the elapsed time in seconds since the last Start()
        private static float Stop()
        {
            long stopCount = 0;
            long frequency = 0;

            QueryPerformanceCounter(ref stopCount);
            QueryPerformanceFrequency(ref frequency);

            return( (float) (stopCount - startCount) / (float) frequency);
        }
    }
}
// code compliments of Alberto Falossi

Cache the list.Count variable in the for loop and now the difference is
approaching 70%.
That's very significant, to say the least.

The benchmark adheres to this guy's guidelines :)
http://www.yoda.arachsys.com/csharp/benchmark.html
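
The same comparison can be sketched without the P/Invoke timer by using System.Diagnostics.Stopwatch. That class is an assumption about the runtime in use (it arrived in later .NET versions than the ones discussed in this thread), and the class and method names below are made up for illustration. Each loop is run once before timing so that JIT compilation is not included in the measurement.

```csharp
using System;
using System.Collections;
using System.Diagnostics;

static class LoopTimingSketch
{
    // Indexer-based loop with the count cached in a local.
    public static long SumFor(ArrayList list)
    {
        long sum = 0;
        int n = list.Count;
        for (int i = 0; i < n; i++)
        {
            sum += (int)list[i];
        }
        return sum;
    }

    // Enumerator-based loop over the same list.
    public static long SumForeach(ArrayList list)
    {
        long sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }

    static void Main()
    {
        const int count = 2000000;
        ArrayList list = new ArrayList(count);
        for (int i = 0; i < count; i++) list.Add(i);

        // Warm-up calls so both methods are already JIT-compiled.
        SumFor(list);
        SumForeach(list);

        Stopwatch watch = Stopwatch.StartNew();
        long forSum = SumFor(list);
        watch.Stop();
        Console.WriteLine("FOR     {0} ms (sum {1})", watch.Elapsed.TotalMilliseconds, forSum);

        watch = Stopwatch.StartNew();
        long foreachSum = SumForeach(list);
        watch.Stop();
        Console.WriteLine("FOREACH {0} ms (sum {1})", watch.Elapsed.TotalMilliseconds, foreachSum);
    }
}
```

Using the sums in the output keeps the loops from being trivially removable; the timings themselves will differ from machine to machine.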
 
Daniel O'Connell

Alvin Bruney said:
Yes it is significantly slower.

There is inherent overhead in the foreach:
the foreach unwraps into a finally construct at compile time which
must be executed upon loop completion. That call involves a stack allocation
and a dispose call.

The call to Dispose is only a short call and in *most* cases is probably a
simple operation or a no-op. An enumerator isn't required to implement
IDisposable, and the end result should be imperceptible in most cases.
However, out of curiosity, I removed the finally check and the performance
differences were negligible (I couldn't have told the run with Dispose from
the run without it from the results alone).
The call to MoveNext may very well be most of the issue.

To throw a bit in real quick, you should always benchmark for & foreach for
performance-critical code (if you feel it is important), instead of blindly
assuming foreach is slower. In more complicated cases, an enumerator may be
written in such a way that it performs access faster than an
indexer (perhaps by skipping certain verifications or events, or whatever).
Also, if you switch to using an array instead of an ArrayList the
results[1] are nearly equal, with foreach usually winning on my machine (it
appears foreach works like a for loop in this case, apparently capable of
beating a normal for loop). Then, with the coming of iterators the face of
enumeration performance may well change considerably.
Another thing to beware of is that foreach may return results in a different
order than for. Although code usually shouldn't be so fragile that it breaks
in such a situation, such issues need to be considered. Also consider the
inherent readability differences.
Also, in any situation where .05 seconds per 2 million operations is
important, there probably is reason to drop managed code in its entirety
and move back to straight native code.

1. Some of my results using an int[] instead of an ArrayList:
//foreach wins
FOR 0.007166618
FOREACH 0.005370928
//foreach wins
FOR 0.006275462
FOREACH 0.005444145
//for wins
FOR 0.005783775
FOREACH 0.006185961
//foreach wins
FOR 0.005751621
FOREACH 0.005480257

 
Alvin Bruney

To throw a bit in real quick, you should always benchmark for & foreach for
performance-critical code (if you feel it is important), instead of blindly
assuming foreach is slower

Well said.

Yeah, it's well worth your while to benchmark instead of going on hearsay for
performance-critical applications. I agree wholeheartedly.

--
Regards,
Alvin Bruney
Got DotNet? Get it here
http://home.networkip.net/dotnet/tidbits/default.htm
 
Jon Skeet [C# MVP]

Yes it is significantly slower.

I was surprised at just how much slower your results *did* show it to
be.

There is inherent overhead in the foreach:
the foreach unwraps into a finally construct at compile time which
must be executed upon loop completion. That call involves a stack
allocation and a dispose call.

That's a single call though (assuming the enumerator even implements
IDisposable), which means it will become less and less significant as
the number of iterations goes up - that doesn't tie in with the results
of your code (where increasing the count changes the result
proportionally, pretty much). It also doesn't tie in with the results
for strings and arrays, where foreach *isn't* significantly slower.
(See in a second)

<snip>

I changed the ArrayList to an array in your code, and there was no
significant difference in execution speed between the foreach and the
for.

I then did the same with iterating over a string's characters using the
indexer vs using foreach - and again, there was no difference in speed.

So, it's not that foreach itself is slow - it's that foreach over an
ArrayList is relatively slow. I suspect this is because it tries to
detect concurrent modifications. Just for kicks, I wrote a wrapper
around ArrayList that returns an enumerator which *doesn't* check for
anything - the results were then significantly better than before,
although still not quite as fast as the straight for loop. I'm not
entirely sure where the remaining difference comes from - possibly the
two checks for being within the bounds of the ArrayList (one in the
enumerator, one in the indexer) aren't being reduced to one. I'm half
tempted to write a version of ArrayList which does really minimal
bounds checking, just to see whether that would get it up to array
speed... but I can't be bothered just now :)
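
A wrapper of the kind described above could be sketched as follows. The type names are hypothetical and this is not the code that was actually benchmarked; the enumerator simply walks the list by index and does no check for modifications made during iteration. The ArrayList indexer still does its own bounds check on every access.

```csharp
using System;
using System.Collections;

// Hypothetical wrapper: enumerates an ArrayList without any
// "collection was modified" check.
sealed class UncheckedList : IEnumerable
{
    private readonly ArrayList inner;

    public UncheckedList(ArrayList inner)
    {
        this.inner = inner;
    }

    public IEnumerator GetEnumerator()
    {
        return new UncheckedEnumerator(inner);
    }

    private sealed class UncheckedEnumerator : IEnumerator
    {
        private readonly ArrayList list;
        private int index = -1; // positioned before the first element

        public UncheckedEnumerator(ArrayList list)
        {
            this.list = list;
        }

        public bool MoveNext()
        {
            // Advance only while there is something left to visit.
            if (index < list.Count) index++;
            return index < list.Count;
        }

        public object Current
        {
            get { return list[index]; }
        }

        public void Reset()
        {
            index = -1;
        }
    }
}

static class UncheckedListDemo
{
    static void Main()
    {
        ArrayList data = new ArrayList();
        for (int i = 1; i <= 3; i++) data.Add(i);

        int sum = 0;
        foreach (int value in new UncheckedList(data))
        {
            sum += value;
        }
        Console.WriteLine(sum); // prints 6
    }
}
```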

Even with normal ArrayList behaviour, foreach is only twice as slow for the
list iteration part - that's *relatively* slow, but in most cases it's
unlikely to be slow compared with the bit inside the loop. In the
particular case mentioned, the inside of the loop would be appending
strings, and I suspect that unless the strings were very short, the
time taken by the appending would dwarf the list iteration time.

Disclaimer: I've got a monster machine so I don't expect your numbers to
even be in the same ballpark but the difference should be the same.

Actually my results were twice as fast, but there we go. I have a
monster laptop :)
 
Pete Davis

I have to say, I'm surprised by the following:

1: Nobody's benchmarked all these performance issues people keep bringing up
and posted it somewhere with sample code and details of why it is the way it
is.

2: That so many people are so concerned with performance details. Maybe it's
just me, but in very few cases has performance been an issue for me. All
this talk about stringbuilder vs. string and for vs. foreach.

I've written a great deal of code in the past two years in C#. I generally
use string, even when I'm building up large strings out of many parts. I use
foreach at every opportunity. My machines run from 550MHZ to 1.7GHZ, so
fairly middle of the road or low end, and I've yet to have to change my code
in this regard to make it faster.

Maybe I'm just more patient than I used to be, but performance just hasn't
been a big issue in any of my code for a long time.

Pete
 
Alvin Bruney

That's great. Really it is. But consider that some of us have requirements
that are entirely different to yours. Some users require performance even
before functionality. They simply will not tolerate mediocrity. Otherwise,
they curse the software, call it evil and take their contract somewhere
else. At times like these, you definitely need to go after performance.

Going after performance depends on your requirements mostly. Other times,
it's just that you want to best yourself by writing the slickest, fastest,
smallest, meanest, most performant code. That last part is for programmer
junkies. These are the ones who are obsessed with performance. These are the
ones who bother to time control structures. These are the cream of the crop.
The ones who make the big bucks. The ones that cannot sleep because there's
still some blood left in the stone.

ok ok, i got a little carried away there.

--
Regards,
Alvin Bruney
Got DotNet? Get it here
http://home.networkip.net/dotnet/tidbits/default.htm
 
Jon Skeet [C# MVP]

Pete Davis said:
I have to say, I'm surprised by the following:

1: Nobody's benchmarked all these performance issues people keep bringing up
and posted it somewhere with sample code and details of why it is the way it
is.

I've seen numerous posts with benchmarks in these newsgroups, and
similar things on web pages. When it comes to string vs StringBuilder,
the difference is so clear when you go above a few thousand
concatenations that timing isn't worth doing.

For a web site which does the kind of thing you're after but for Java
(where StringBuffer is almost identical to StringBuilder) see
http://www.pobox.com/~skeet/java/stringbuffer.html

2: That so many people are so concerned with performance details. Maybe it's
just me, but in very few cases has performance been an issue for me. All
this talk about stringbuilder vs. string and for vs. foreach.

The foreach vs for argument I'd agree with you about - because it
really doesn't matter even if it *is* twice as slow. string vs
StringBuilder is rather different though, because the cost of repeated
concatenation grows roughly quadratically with the number of strings
rather than linearly.

I've written a great deal of code in the past two years in C#. I generally
use string, even when I'm building up large strings out of many parts.

How many parts? Do you know how many in advance? If you're reading
things from a file, you can often easily suddenly get a file which is
much larger than a previous one - and when your code suddenly goes from
running in seconds to running in hours, that's unpleasant.

I use
foreach at every opportunity. My machines run from 550MHZ to 1.7GHZ, so
fairly middle of the road or low end, and I've yet to have to change my code
in this regard to make it faster.

Maybe I'm just more patient than I used to be, but performance just hasn't
been a big issue in any of my code for a long time.

And that's generally a good attitude - tinkering around with little
things isn't worth doing generally. However, I make an exception for
string vs StringBuilder for anything other than toy code, unless I know
some kind of upper limit on the number of strings I'm concatenating.

For instance, concatenating 10,000 strings on my box takes about a 20th
of a second - but as soon as you go up to 100,000, it takes 18 seconds.
(That's only concatenating a space at a time.) Using StringBuilder for
the same test takes about a 60th of a second for 100,000. The
consequences of that kind of performance difference would make a *huge*
difference practically wherever it came up. Even if it's unlikely I'll
ever get to 100,000 strings, the effect on performance isn't worth the
slight effort it takes to use StringBuilder instead.
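
A rough way to see why the two approaches diverge is to count characters copied. The sketch below assumes a simplified model in which every += copies the whole new string; the class and method names are hypothetical, and the model ignores allocation and garbage collection, so it is an illustration rather than a prediction of real timings.

```csharp
using System;

static class CopyCountSketch
{
    // Characters copied when one character is appended n times with +=,
    // under the assumption that each += copies the entire new string:
    // 1 + 2 + ... + n.
    public static long CopiesForConcat(int n)
    {
        long total = 0;
        for (int length = 1; length <= n; length++)
        {
            total += length;
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(CopiesForConcat(10000));  // prints 50005000
        Console.WriteLine(CopiesForConcat(100000)); // prints 5000050000
    }
}
```

Under this model, ten times as many concatenations means about a hundred times as many characters copied, whereas a builder with enough spare capacity only has to copy each appended character once into its buffer. The timings quoted above grow even faster than the model suggests, which may be down to memory-management costs that it leaves out.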
 
Jon Skeet [C# MVP]

Going after performance depends on your requirements mostly. Other times,
it's just that you want to best yourself by writing the slickest, fastest,
smallest, meanest, most performant code. That last part is for programmer
junkies. These are the ones who are obsessed with performance. These are the
ones who bother to time control structures. These are the cream of the crop.
The ones who make the big bucks. The ones that cannot sleep because there's
still some blood left in the stone.

ok ok, i got a little carried away there.

More than just carried away - I seriously hope you don't believe that
performance junkie == good programmer. For most issues, the extra
tweaks you can put in will have a minimal effect on performance -
anything linear (even doubling the performance of that section of code)
isn't worth worrying about until you know you've got a problem. Good
programmers (IMO) value simplicity and readability *way* higher than
raw performance before there's a problem.

String vs StringBuilder is one exception, again IMO, because of the way
that performance difference manifests itself. If the user asks a
program to load a large file, say one that's ten times as large as the
previous one, they're unlikely to mind if it takes ten times as long.
They *will* mind if the trend is far worse than linear and suddenly they're
waiting minutes or hours for a file to load. for vs foreach though?
Don't sweat it until you've found which of the many, many loops in your
code is actually the one which is the bottleneck.
 
Alvin Bruney

performance junkie == good programmer

I happen to believe that. Here is why. Going after performance implies that
you were a good programmer, otherwise you'd have no base to improve on
performance. You'd lack the theory. You'd lack the coding technique. You'd
lack the fundamental knowledge about optimization to tackle that kind of
thing. You wouldn't know where to start.

It's like saying fighter pilot == a good pilot. How can you sit in a test
cockpit harness if you can't fly well? You have to be a good programmer
first, then you graduate into a performance junkie. How do you expect to
performance tune if you weren't a hot shot to begin with? That's the way I
see it.

It really doesn't matter to me if I'm writing code for the defense
department or for the science school team. The habits ingrained in me force
me to find the most efficient way to write code. So I will replace
stringbuilders with strings (don't quite believe the stringbuilder hype) and
I will not normally write foreach loops instead of for loops. It doesn't
necessarily make my code always run faster than yours; I do it because it's
a good habit to write performant code. I do it because I believe it pays
dividends later, but mostly out of habit. Readability is pretty much a given
these days. This is not the days of C++ when a pointer could leave the
screen and walk on the desktop.
 
Jon Skeet [C# MVP]

performance junkie == good programmer

I happen to believe that. Here is why. Going after performance implies that
you were a good programmer, otherwise you'd have no base to improve on
performance. You'd lack the theory. You'd lack the coding technique. You'd
lack the fundamental knowledge about optimization to tackle that kind of
thing. You wouldn't know where to start.

On the contrary - lots of people go after performance not realising
that:

a) By concentrating more on the design in the first place than the
details later on, they could probably have gained more performance
anyway.

b) By concentrating on simple, readable code instead of code which runs
very quickly, they're likely to have fewer bugs.

c) Going after performance where it's not needed is just a waste of
effort.

Being a performance junkie isn't the same as being concerned about
performance where necessary - the latter suggests that you understand
that not every piece of code *does* need to be optimised, because most
of your code won't end up as a bottleneck.

It's like saying fighter pilot == a good pilot. How can you sit in a test
cockpit harness if you can't fly well. You have to be a good programmer
first, then you graduate into a performance junkie.

That's not the way it happens though - people try to go after
performance when they really don't need to, and indeed shouldn't due to
the loss of readability which is often involved when going for really
fast code.

It really doesn't matter to me if I'm writing code for the defense
department or for the science school team. The habits ingrained in me force
me to find the most efficient way to write code.

In that case you won't be writing the most readable code. There's
almost always a trade-off between performance and readability, even if
it's only a fairly slight one.

So I will replace
stringbuilders with strings (don't quite believe the stringbuilder hype) and
I will not normally write foreach loops instead of for loops. It doesn't
necessarily make my code always run faster than yours; I do it because it's
a good habit to write performant code. I do it because I believe it pays
dividends later, but mostly out of habit. Readability is pretty much a given
these days. This is not the days of C++ when a pointer could leave the
screen and walk on the desktop.

Readability is certainly *not* a given. Readability varies *enormously*
between programmers, and I believe that things like foreach improve
readability greatly - they very efficiently state that they're
iterating through the elements in a sequence of some description.

In my view, readability should be number one on the list of a
programmer's priorities - even before getting the code working
properly! It's easy to get from code which doesn't work but is readable
to code which does work but remains readable - far more so than getting
code which is messy but works to code which is readable and still
works. The same kind of thing is true for performance - get it working
simply and reliably first, then find out where it's not performing,
then fix that bit of the code if you really need to, after measuring
appropriately.

The order in which I'd develop in an ideal world would go something
like:

1) Write class specification, bearing in mind architectural
performance (often more of a system issue than an individual class
issue)
2) Write class documentation and empty method stubs
3) Write unit testing code
4) Write first pass simple implementation
5) Get that implementation working against all the tests
6) Run system tests (developed in parallel by a separate team,
probably) including performance measurements
7) If system doesn't perform adequately, isolate bottleneck and
optimise
 
Daniel O'Connell

The downside, of course, is that not everyone wants to work with hotshots. They
tend not to be as good as they think they are.
To use your analogy, a fighter pilot may be a great fighter pilot, but that
doesn't mean I necessarily want him flying consumer planes, especially if he
doesn't know when to stop flying like a fighter pilot. Any programmer who is so
attentive to performance that things like clarity, simplicity, and
security become secondary is not someone I would want on any team I was on. He
would likely be more of a detriment than anything else.
Being a good programmer, not just a performance tuner, requires a lot more
than just good performance.
 
A

Alvin Bruney

This could go on and on. I'll agree to disagree with you all.

Daniel O'Connell said:
Downside, of course, is not everyone wants to work with hotshots.

Hmmm, that's where you learn to bend a language backward and forward. You
absolutely want to work with hotshots, if only to become a hotshot yourself.

Daniel O'Connell said:
To be a good programmer, not just a performance tuner, requires a lot more
than just good performance.

I'll say here that you call in the performance tuners AFTER the good
programmers have written the code and it is working correctly. That's when
you bring in the heavy artillery.

but still, you've made some agreeable points.

nuff said.
--
Regards,
Alvin Bruney
Got DotNet? Get it here
http://home.networkip.net/dotnet/tidbits/default.htm
 