Is It Better to CLEAR or NEW?

pamela fluente · Sep 3, 2007

I would like to hear your *opinion and advice* on best programming
practice under .NET.

Granted, there are many cases where we cannot replace

MyCollection.Clear with the instantiation of a NEW MyCollection

because doing so would orphan references that are still needed.

But I also notice that in many cases I can "equivalently" decide
whether a collection (or any other object) should be cleared and
reused or a NEW one should be created.

This usually happens with collections having class scope.

In such cases, where the two are functionally "equivalent" and I can
choose, which is more advisable?

Is it better to clear a collection and reuse it, or to just
instantiate a brand new one and leave the old one to the GC?

I mean in terms of memory usage, speed, GC, etc.

-P

Patrick Steele · Sep 3, 2007

Just create a new one (assuming the elements of your collection don't
hold on to any unmanaged resources).

As you develop your app, keep testing it against your desired performance
benchmarks (speed, memory usage, etc.). Wait until you actually start to
see a problem before fixing it.

Hilton · Sep 3, 2007

Patrick said:
As you develop your app, keep testing it against your desired performance
benchmarks (speed, memory usage, etc.). Wait until you actually start to
see a problem before fixing it.

I couldn't disagree more, but that's just me. My goal is to build in
performance and quality, not test it in. Then again, I write primarily for
the Pocket PC and Smartphone, so I really have to think about performance.
However, 'testing-in' performance and/or quality IMHO is not the right way.

Pamela, why don't you run some tests (just a few lines) and report back to
the group which was faster and by how much. Also, if this clear/new is in
an inner loop, it seems logical that clear would be better, but ignoring the
memory allocation and garbage collection piece of the puzzle, the speed diff
would be good info.

Thanks,

Hilton

Jon Skeet [C# MVP] · Sep 3, 2007

Hilton said:
I couldn't disagree more, but that's just me. My goal is to build in
performance and quality, not test it in. Then again, I write primarily for
the Pocket PC and Smartphone, so I really have to think about performance.
However, 'testing-in' performance and/or quality IMHO is not the right way.

I'm with Patrick. There's little point in micro-optimising *all* the
code (and often making it less readable at the same time) when
bottlenecks are often in unexpected places.

See

http://en.wikipedia.org/wiki/Optimization_(computer_science)#Quotes

pamela fluente · Sep 3, 2007

Jon Skeet said:
I'm with Patrick. There's little point in micro-optimising *all* the
code (and often making it less readable at the same time) when [...]

Well, we want to both MACRO- and micro-optimize our code! Don't we? :)

Actually this is not a minor issue, because this is a task that occurs
many times in programs, and if there is a recommended way I would tend
to go that way.

In case you are implying that CLEAR would be "less readable", I can
say that, to be really safe, I would always recommend that anyone use
CLEAR and not NEW because, in my experience, the second way can be
*very* dangerous, generally speaking.

You *really* must know what you are doing and watch carefully when you
indulge in a NEW for a collection with class scope, and it takes
experience to stay out of trouble.
Clearly I am talking about complex projects: in simple examples or
projects these issues usually do not appear....

I would like to hear your different opinions on this matter, based on
your personal experience, especially with big programs ...

thanks

-P

pamela fluente · Sep 3, 2007

Ok I have done a few quick tests.

As to speed only, it turns out that the CLEAR way is 2 times faster
(!).

Hmm, that seems almost surprising, actually.

-P

Jon Skeet [C# MVP] · Sep 3, 2007

pamela fluente said:
Well, we want to both MACRO- and micro-optimize our code! Don't we? :)

It's worth optimising *some* things: making web service interfaces
bulky instead of chatty, looking at the complexity of algorithms in
big-O notation etc. That's a far cry from trying to make every bit of
code as fast as it can possibly be just *in case* it becomes a
bottleneck.

Actually this is not a minor issue, because this is a task that occurs
many times in programs, and if there is a recommended way I would tend
to go that way.

How often is "several times"? There are a lot of things which may occur
"several times" during the lifecycle of an application, but still not
contribute significantly to performance.

In case you are implying that CLEAR would be "less readable", I can
say that, to be really safe, I would always recommend that anyone use
CLEAR and not NEW because, in my experience, the second way can be
*very* dangerous, generally speaking.

The problem is that it really *is* "generally speaking". In my view it
entirely depends on the context.

You *really* must know what you are doing and watch carefully when you
indulge in a NEW for a collection with class scope, and it takes
experience to stay out of trouble.
Clearly I am talking about complex projects: in simple examples or
projects these issues usually do not appear....

In either case you need to know what you're doing. Sometimes the
collections will be visible elsewhere, sometimes they won't be.
Sometimes there's a performance impact, sometimes there isn't. It's one
of those decisions I'd look at on a case-by-case basis rather than
trying to come up with a hard and fast rule.

I would like to hear your different opinions on this matter, based on
your personal experience, especially with big programs ...

My personal experience is that the right answer varies by context.

Jon Skeet [C# MVP] · Sep 3, 2007

pamela fluente said:
Ok I have done a few quick tests.

As to speed only, it turns out that the CLEAR way is 2 times faster
(!).

Hmm, that seems almost surprising, actually.

But have you tested whether it's *relevant* that clearing the
collection is faster? Is this even *slightly* significant in the
overall performance of your application? If not, use whichever code is
clearer.

Peter Duniho · Sep 3, 2007

Hilton said:
I couldn't disagree more, but that's just me. My goal is to build in
performance and quality, not test it in. Then again, I write primarily for
the Pocket PC and Smartphone, so I really have to think about performance.

Everyone has to think about performance, regardless of platform.
Everyone should write code that performs well.

But for most code, all this really means is to not write inefficient
_algorithms_. Don't use an algorithm that is O(N^2), or even O(N log N)
for that matter, when an O(N) algorithm will do.

Differences in implementation of the same algorithm are not likely to
produce a performance difference that the user will notice in most
cases, while other aspects of the implementation such as overall code
maintainability and obviousness of the implementation details often do,
in the form of code that actually _works_ and doesn't have unanticipated
complications.

In addition, while one can test performance of a specific section of
code, there is not even any guarantee that such tests will translate
into a real-world application. There is more to the question of
performance than just what the basic CPU instruction timing can tell
you. For example, code that performs better in a specific scenario, but
which is larger than a similar, simpler version of the same algorithm
may in fact under-perform in other scenarios, whether due to interaction
with surrounding code or differences in the exact hardware
configuration, etc.

And of course, if you manage to squeeze a 50% improvement (an unusually
large optimization result, assuming a correct algorithm has been
designed in the first place) out of code that only consumes 1% or less
of the total execution cost, you haven't achieved anything the user will
ever care about.
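
The arithmetic behind that point can be sketched in a few lines. This is only an illustration: it reads "a 50% improvement" as "the optimized code takes half as long", and the class and method names are invented for the example.

```csharp
using System;

static class OptimizationPayoff
{
    // fractionOfRuntime: share of the total run time spent in the optimized code (0..1).
    // timeSavedInThatCode: share of that code's own time the optimization removes (0..1).
    // Returns the share of the total run time that is saved.
    public static double OverallTimeSaved(double fractionOfRuntime, double timeSavedInThatCode)
    {
        return fractionOfRuntime * timeSavedInThatCode;
    }

    static void Main()
    {
        // Code that accounts for 1% of the run time, made 50% cheaper:
        // 0.01 * 0.5 = 0.005, i.e. half a percent of the total.
        double saved = OverallTimeSaved(0.01, 0.5);
        Console.WriteLine("{0:P1} of total run time saved", saved);
    }
}
```

Under these assumptions the whole program gets about half a percent faster, which is well below what a user would notice.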

So, even if you manage to prove one implementation performs better than
another in a specific situation, that isn't necessarily going to
translate into better performance for the end user.

Every software project has a finite amount of man-hours that can be
applied to it. You are doing your users a disservice if you spend some
of those man-hours optimizing code that has no need of optimization,
rather than doing things like adding features or ensuring that the code
is easily maintained, especially since those hours spent optimizing may
in fact have counter-productive results.

IMHO, the question of whether to use a new instance versus clearing an
existing one should relate more to what makes the code more readable
than which performs better. And in many cases, having an instance local
to a loop and which is initialized in each iteration of the loop is much
more readable and easily-maintained. (And in other cases, it may not be,
in which case one would choose an alternate method.)

As Patrick and Jon have both said, once you have a complete
implementation, then it makes sense to identify and address any
potential performance problems. At that point, you will know what areas
of the code are actually affecting the user experience, and you will be
able to measure changes in the implementation in a way that takes into
account the context of those changes.

Pete

Peter Duniho · Sep 3, 2007

pamela said:
[...]
In case you are implying that CLEAR would be "less readable", I can
say that, to be really safe, I would always recommend that anyone use
CLEAR and not NEW because, in my experience, the second way can be
*very* dangerous, generally speaking.

How so? How is creating a new instance and assigning to a variable
"very dangerous" as compared to clearing an existing instance already
referenced by the same variable?

You *really* must know what you are doing and watch carefully when you
indulge in a NEW for a collection with class scope, and it takes
experience to stay out of trouble.

What does the scope of the variable have to do with it? If you are
clearing the instance, surely that is just as dangerous as creating a
new one for the same variable.

Pete

Peter Duniho · Sep 3, 2007

pamela said:
Ok I have done a few quick tests.

As to speed only, it turns out that the CLEAR way is 2 times faster
(!).

In what context? How did you measure the difference?

It is entirely possible that clearing a collection takes half the time
that creating a new instance does. But that difference is only relevant
if that's _all_ you are doing.

Both operations should be _very_ inexpensive, so if the code that uses
the collection is doing anything that is at all interesting, I would be
surprised if you found any useful difference in time cost using one
versus the other.

Pete

pamela fluente · Sep 3, 2007

Peter Duniho said:
pamela said:

[...]
In case you are implying that CLEAR would be "less readable", I can
say that, to be really safe, I would always recommend that anyone use
CLEAR and not NEW because, in my experience, the second way can be
*very* dangerous, generally speaking.

How so? How is creating a new instance and assigning to a variable
"very dangerous" as compared to clearing an existing instance already
referenced by the same variable?

You *really* must know what you are doing and watch carefully when you
indulge in a NEW for a collection with class scope, and it takes
experience to stay out of trouble.

What does the scope of the variable have to do with it? If you are
clearing the instance, surely that is just as dangerous as creating a
new one for the same variable.

Pete

No, sorry. If you say that, it means you are missing some important
points.

In complex applications it can happen that the Collection is used in
several other places in the program. There may also be sublists of its
items.

There is a fundamental difference between clearing and making a new
instance.

If you make a new instance, you may be left with a lot of orphans
around: other parts of the program still hold references to the old
instance.
Trust me. Redefining as New a Collection used as a member of a class
is just the entrance to an intricate debugging hell.
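
A minimal sketch of the orphaning problem described here, assuming a class that hands out a reference to its member list. The names Inventory, ResetWithClear, ResetWithNew and OrphanDemo are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, invented for this illustration only.
class Inventory
{
    private List<string> items = new List<string>();

    // Hands out a reference to the very same list instance.
    public List<string> Items { get { return items; } }

    public void Add(string item) { items.Add(item); }

    // CLEAR: the existing instance is emptied; every holder sees the change.
    public void ResetWithClear() { items.Clear(); }

    // NEW: the field now points at a fresh list; earlier holders keep the old one.
    public void ResetWithNew() { items = new List<string>(); }
}

static class OrphanDemo
{
    // Returns how many items a previously obtained reference still sees
    // after the inventory is reset and one new item is added.
    public static int CountSeenByOldReference(bool useNew)
    {
        var inventory = new Inventory();
        inventory.Add("a");
        inventory.Add("b");

        List<string> view = inventory.Items; // e.g. cached by another class

        if (useNew) inventory.ResetWithNew();
        else inventory.ResetWithClear();

        inventory.Add("c");
        return view.Count;
    }

    static void Main()
    {
        Console.WriteLine(CountSeenByOldReference(false)); // 1: still the same instance
        Console.WriteLine(CountSeenByOldReference(true));  // 2: stale, orphaned list
    }
}
```

With Clear, the cached reference keeps tracking the live list. With New, it silently keeps the old contents and never sees later additions, which is the kind of bug that is hard to trace in a large program.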

-P

Peter Duniho · Sep 3, 2007

pamela said:
No, sorry. If you say that it means you are missing some important
points.

No doubt. That's why I asked for clarification.

In complex applications it can happen that the Collection is used in
several other places in the program. There may also be sublists of its
items.

There is a fundamental difference between clearing and making a new
instance.

Yes, but not to the class where the instance exists. If the collection
is shared, yes...it would be up to the designer of the code to ensure
that the collection is used consistently. But that doesn't affect the
variable that's being changed itself. Hence my question.

You've explained better now what you mean. I would still assert that in
the scenario you describe, obviously performance is NOT the deciding
factor between one design and another. But at least now I have an idea
of what you're talking about.

Trust me. Redefining as New a Collection used as a member of a class
is just the entrance to an intricate debugging hell.

Actually, it's not so much the using a new instance versus clearing
that's the issue here, as it is a design that allows for shared use of a
single instance without imposing some rules about how that single
instance is managed.

Pete

Göran Andersson · Sep 3, 2007

That depends entirely on how you use the collections. Creating a new
collection lets the garbage collector reclaim the memory of the previous
collection (once nothing else references it), but reusing a collection
might minimise reallocation of internal buffers, as you hold on to the
allocated memory. If you are short on memory, you should definitely
create a new collection.

The memory management in .NET is based on the assumption that most
objects are short-lived, so I would recommend that you just create a new
collection when needed, instead of trying to hold on to objects as long
as possible.

Also, calling Clear clears each reference in the collection, while if
you just drop the reference to the collection, all references in the
collection automatically become unreachable without any extra work at
all.
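
The buffer trade-off can be observed through List<T>.Capacity. The following is a small sketch, not a benchmark; the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class CapacityDemo
{
    // Fills a list with `count` doubles, clears it, and returns it.
    public static List<double> FillThenClear(int count)
    {
        var list = new List<double>();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        list.Clear(); // Count drops to 0, the internal buffer is kept
        return list;
    }

    static void Main()
    {
        List<double> reused = FillThenClear(1000);
        Console.WriteLine("Count after Clear:    " + reused.Count);
        Console.WriteLine("Capacity after Clear: " + reused.Capacity);

        // A list from the parameterless constructor has not allocated a buffer yet.
        var fresh = new List<double>();
        Console.WriteLine("Capacity of new list: " + fresh.Capacity);

        // If memory matters more than speed, the retained buffer can be released.
        reused.TrimExcess();
        Console.WriteLine("Capacity after TrimExcess: " + reused.Capacity);
    }
}
```

The cleared list still holds room for at least 1000 elements, so refilling it needs no reallocation, while the new list has to grow its buffer again as elements are added.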

Michel Posseth [MCP] · Sep 3, 2007

pamela fluente said:
Ok I have done a few quick tests.

As to speed only, it turns out that the CLEAR way is 2 times faster
(!).

Hmm, that seems almost surprising, actually.

-P

Hi Pamela

It isn't 2 times faster, at least not on my system (Dell Dimension 9200,
dual core 6400 with 2 GB of 667 memory, striped disks, on Windows
Media Center 2005).
Sometimes it is faster and sometimes slower, and sometimes the differences
are huge (in both directions). Run the test a few times on large
collections and you will notice this "strange" behavior.

pamela fluente · Sep 3, 2007

Michel Posseth said:

Hi Pamela

It isn't 2 times faster, at least not on my system (Dell Dimension 9200,
dual core 6400 with 2 GB of 667 memory, striped disks, on Windows
Media Center 2005).
Sometimes it is faster and sometimes slower, and sometimes the differences
are huge (in both directions). Run the test a few times on large
collections and you will notice this "strange" behavior.

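I made a simple experiment by defining a collection of double
called "a" at class level and then running code similar to:

// Needs System.Collections.Generic, System.Diagnostics and
// Microsoft.VisualBasic (for Interaction.MsgBox).
// "a" is a class-level List<double>. It is pre-filled here, so in the
// Clear variant its capacity has already grown before timing starts.
for (double i = 0; i <= 10000; i++) {
    a.Add(i);
}
Stopwatch s = new Stopwatch();
s.Start();
for (int j = 0; j <= 100000; j++) {
    a = new List<double>();   // variant 1: NEW
    //a.Clear();              // variant 2: CLEAR (swap the comments)
    for (double i = 0; i <= 1000; i++) {
        a.Add(i);
    }
}
s.Stop();
Interaction.MsgBox(s.Elapsed.TotalMilliseconds);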

In various trials, the New took about 6 seconds and the Clear 3
seconds on my pc.

-P

Jon Skeet [C# MVP] · Sep 3, 2007

pamela fluente said:
I made a simple experiment by defining a collection of double
called "a" at class level and then running code similar to:

for (double i = 0; i <= 10000; i++) {
    a.Add(i);
}
Stopwatch s = new Stopwatch();
s.Start();
for (int j = 0; j <= 100000; j++) {
    a = new List<double>();
    //a.Clear();
    for (double i = 0; i <= 1000; i++) {
        a.Add(i);
    }
}
Interaction.MsgBox(s.Elapsed.TotalMilliseconds);

In various trials, the New took about 6 seconds and the Clear 3
seconds on my pc.

So what you mean is "clear a list and populate it with 1000 doubles" is
twice as fast as "create a new list and populate it with 1000 doubles".
There's a big difference there.

What happens if you change 1000 to 5? What happens if you change 1000
to 1000000? What happens if you use the List constructor which takes a
capacity?

More importantly: what does your real application do? And is the time
taken to clear/recreate the list actually significant?
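
One way to try those variations is a small harness like the sketch below. It is an assumption-laden illustration rather than a rigorous benchmark: the names are invented, JIT warm-up and garbage collection pauses are not controlled for, and the numbers will differ from machine to machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ClearVsNewBenchmark
{
    public enum Variant { New, NewWithCapacity, Clear }

    // Times `rounds` repetitions of "reset the list, then add `items` doubles".
    public static double MeasureMilliseconds(Variant variant, int items, int rounds)
    {
        var list = new List<double>();
        Stopwatch watch = Stopwatch.StartNew();
        for (int round = 0; round < rounds; round++)
        {
            switch (variant)
            {
                case Variant.New:             list = new List<double>();      break;
                case Variant.NewWithCapacity: list = new List<double>(items); break;
                case Variant.Clear:           list.Clear();                   break;
            }
            for (int i = 0; i < items; i++)
            {
                list.Add(i);
            }
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        foreach (int items in new[] { 5, 1000, 1000000 })
        {
            int rounds = 10000000 / items; // keep the total number of Add calls constant
            foreach (Variant variant in Enum.GetValues(typeof(Variant)))
            {
                double ms = MeasureMilliseconds(variant, items, rounds);
                Console.WriteLine("{0,8} items, {1,-15}: {2:F1} ms", items, variant, ms);
            }
        }
    }
}
```

Running each combination several times, as suggested earlier in the thread, gives a better picture than a single run, and none of it says whether the difference matters in a real application.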

pamela fluente · Sep 3, 2007

Jon Skeet said:
So what you mean is "clear a list and populate it with 1000 doubles" is
twice as fast as "create a new list and populate it with 1000 doubles".
There's a big difference there.

What happens if you change 1000 to 5? What happens if you change 1000
to 1000000? What happens if you use the List constructor which takes a
capacity?

More importantly: what does your real application do? And is the time
taken to clear/recreate the list actually significant?

Well, my point is probably the following.

Although I wanted to hear your opinions, my current belief is that one
should *always* use CLEAR, for conceptual reasons (no orphan
references), even if it is much slower.
(This is based on my experience with some millions of lines of code,
but I may well change my mind in the next million :) )

Further, if it turned out that reusing is even faster than recreating,
then ... there would be no doubt about the best approach ...

Anyway I really am **interested** in sharing and hearing all the
opinions...

So go ahead ...

-P

Jon Skeet [C# MVP] · Sep 3, 2007

pamela fluente said:
Well, my point is probably the following.

Although I wanted to hear your opinions, my current belief is that one
should *always* use CLEAR, for conceptual reasons (no orphan
references), even if it is much slower.

No, you shouldn't always use Clear. You should use whatever is most
appropriate to the situation. Sometimes you want to *logically* create
a new collection, sometimes you want to *logically* clear an existing
one.

It sounds like you've been stung by a situation where you created a new
collection when you should have cleared an existing one - changing to
*always* calling Clear, you're bound to be stung by the reverse
situation.

Rather than following one rule blindly, take account of different
situations and treat them on a case by case basis.

Doug Semler · Sep 3, 2007

pamela fluente said:
Well, my point is probably the following.

Although I wanted to hear your opinions, my current belief is that one
should *always* use CLEAR, for conceptual reasons (no orphan
references), even if it is much slower.
(This is based on my experience with some millions of lines of code,
but I may well change my mind in the next million :) )

Further, if it turned out that reusing is even faster than recreating,
then ... there would be no doubt about the best approach ...

Anyway I really am **interested** in sharing and hearing all the
opinions...

You also haven't specified (or have you?) whether the lists will be
accessible from multiple threads of execution, or whether the list is
being used as a backing store...

You referenced "orphans" in your post above. To me, this implies that
there is the possibility that multiple objects are holding references to
the same list. Otherwise, orphaning should not come into the picture,
because once you lose the reference to the list, you lose the reference
to the list's contents.

IIRC, the only speedup you'd see is if the list is rather large, because
the list will keep its capacity and won't have to grow as you add new
elements. But I would think that time would be better spent focusing on
other portions of code than on such minor speedup possibilities.

--
Doug Semler, MCPD
a.a. #705, BAAWA. EAC Guardian of the Horn of the IPU (pbuhh).
The answer is 42; DNRC o-
Gur Hfrarg unf orpbzr fb shyy bs penc gurfr qnlf, abbar rira
erpbtavmrf fvzcyr guvatf yvxr ebg13 nalzber. Fnq, vfa'g vg?
