Memory GC - loop

  • Thread starter: Michael Moreno

Michael Moreno

Hello,

Which of the following two is considered best practice?

CODE 1:

TimeSpan ts;

for (int i = 0; i < 1000; i++)
{
    ts = ...;
    blablabla;
}

and

CODE 2:

for (int i = 0; i < 1000; i++)
{
    TimeSpan ts = ...;
    blablabla;
}


It seems to me that the two may affect the GC differently.
I used a TimeSpan here, but it could be any other type.
Which do you think is best?
Does the GC behave the same way in both cases?

Thanks.
 
One thing is sure:

CODE 2 is slower because it creates a new reference to TimeSpan ts in each
loop iteration.

But the objects are probably disposed of the same way, because when you set
the reference to a new object, the GC collects the old object.
 
In theory CODE 1 should create one variable that remains in scope for
the entire function and overwrite its contents over and over (unless you
have ts = new TimeSpan() inside the loop). CODE 2 would create a new
variable on each iteration of the loop that goes out of scope every
time.
Personally I try not to declare variables inside of a loop, but I never
actually checked how this influences memory usage/runtime behavior.

Sincerely,
Kevin Wienhold
 
Yes it behaves the same, for 2 very different reasons:

First - unless you use "captured variables" (anonymous delegates),
then the nesting of variables is one of the very few things that is
lost when compiling to IL - since all variables are part of the
preamble (".locals init").

Second - TimeSpan is a struct and is not handled by the GC; in either
case there is *one* memory location (on the stack). When you do "= new
TimeSpan(5)", you overwrite that memory. When the method exits the
stack is reclaimed and the variable disappears into the ether. No
objects : no GC.

Marc
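
The "captured variables" exception mentioned above is the one case where the place of declaration changes what the code does. A minimal sketch, assuming a C# compiler with lambda syntax (anonymous delegates behave the same way); the names CaptureDemo, SharedVariable and PerIterationVariable are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

static class CaptureDemo
{
    // One variable declared outside the loop: every delegate captures the same one.
    public static List<Func<int>> SharedVariable()
    {
        var funcs = new List<Func<int>>();
        int value;
        for (int i = 0; i < 3; i++)
        {
            value = i * 10;
            funcs.Add(() => value);
        }
        return funcs;
    }

    // Variable declared inside the loop: each iteration captures its own copy.
    public static List<Func<int>> PerIterationVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int value = i * 10;
            funcs.Add(() => value);
        }
        return funcs;
    }

    static void Main()
    {
        // prints "20 20 20"
        Console.WriteLine(string.Join(" ", SharedVariable().ConvertAll(f => f())));
        // prints "0 10 20"
        Console.WriteLine(string.Join(" ", PerIterationVariable().ConvertAll(f => f())));
    }
}
```

Without a capture, the two declaration styles compile to the same locals; with one, the compiler has to keep a separate variable per iteration for the inner declaration.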
 
It makes no difference whatsoever; e.g.

TimeSpan ts1;
for (int i = 0; i < 100; i++) {
    ts1 = new TimeSpan(i);
}
for (int i = 0; i < 100; i++) {
    TimeSpan ts2 = new TimeSpan(i);
}

compiles to:

.entrypoint
.maxstack 2
.locals init (
[0] [mscorlib]System.TimeSpan span1,
[1] int32 num1,
[2] [mscorlib]System.TimeSpan span2,
[3] bool flag1)
L_0000: nop
L_0001: ldc.i4.0
L_0002: stloc.1
<SNIP>

Local variable names are lost, but notice that span1 and span2 are
both declared method-wide.

Marc
 
I'm not trying to belabor this point, but I think that understanding GC
is *so* important that I'll post a 3rd reply on this chain...

IL> CODE 2 is slower because it creates a new reference to TimeSpan ts
in each loop
MG: no it isn't / there are no references / there is only 1 "ts"

IL> But objects probably disposed same way,...
MG: well, they aren't objects (meaning: reference-types), and even if
they were they aren't "disposed"

IL> because when you set reference to new object GC collect old object.
MG: this is *very* misleading, even when talking about reference-type
objects; this would only happen so readily in a reference-counted
world. In a GC world, the old object simply becomes eligible for
collection once it is no longer referenced - that is all you can say for
sure. GC
does not run continually - rather there are some complex rules that
make it run periodically with extra focus if memory becomes tight - or
it can be invoked manually, but this is not a good idea unless you
really, really know exactly what you are doing and have profiler
evidence to support this strategy. And code comments detailing it all
;-p

Marc
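
The "eligible, but not necessarily collected yet" distinction can be observed with a WeakReference. This is only a sketch: the class and method names are made up, the forced GC.Collect() is there purely to make the demonstration possible (not as a recommended practice), and what it prints can differ between debug and optimized builds and between runtimes, so no output is claimed here.

```csharp
using System;
using System.Runtime.CompilerServices;

static class EligibilityDemo
{
    // Creates an object and returns only a weak reference to it, so no
    // strong reference survives once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference Allocate()
    {
        object o = new object();
        return new WeakReference(o);
    }

    static void Main()
    {
        WeakReference weak = Allocate();

        // The object is already unreferenced here, yet it normally still
        // exists because no collection has run.
        Console.WriteLine("Before collection: IsAlive = " + weak.IsAlive);

        // Demonstration only: force a collection to see the effect.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine("After collection: IsAlive = " + weak.IsAlive);
    }
}
```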
 
Thank you for clearing that up, I wasn't sure how the compiler would
handle this kind of code when turning it into IL code.
In the case of a value type it wouldn't make a difference anyway, but
I assumed the question was aimed at objects in general and thought that
in case of a reference type it would.

Sincerely,
Kevin Wienhold
 
In the case of objects, GC behaves *almost* as you expect. Even if you
nest the variable, it is flattened at runtime into a single stack
variable. I believe that GC will ignore the object currently held in
the stack variable until (at least) the last "read" of that variable
(over *all* iterations) - this is *almost* the same as saying that
each object lives until the end of the loop, but actually I suspect
that if GC ran *right at the start* of a loop (before the variable is
assigned), then the object from the previous loop (still referenced in
the stack) will be retained. All prior objects no longer referenced in
the stack will be eligible for collection.

I don't *think* (happy to be corrected) that the IL wipes the variable
at the end of each loop iteration (the C# compiler's definite-assignment
checks making that unnecessary).
If you had a long running loop, with a loop variable that is often not
used, but sometimes contains a big object (perhaps a large byte[] on
the LOH), then there may be a case for explicitly setting the
variable to null at the end of each iteration (where it was used) so
that the large object can be collected early if the next 2754 loop
iterations don't touch that variable. Of course, there is a chance the
compiler might decide to no-op this instruction, but you do what you
can...

Marc
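
A sketch of the pattern described above, under stated assumptions: Process is a hypothetical stand-in for the real work, the "every 1000th iteration" condition is invented, and 100000 bytes is chosen only because arrays of 85,000 bytes or more go on the LOH. Whether the null assignment changes anything depends on the JIT, which may already treat the variable as dead after its last use or may drop the assignment entirely.

```csharp
using System;

static class LargeBufferLoop
{
    // Hypothetical stand-in for whatever work needs the large buffer.
    static int Process(byte[] data)
    {
        return data.Length;
    }

    public static long Run(int iterations)
    {
        long total = 0;
        for (int i = 0; i < iterations; i++)
        {
            byte[] buffer; // loop variable that most iterations never assign

            if (i % 1000 == 0)
            {
                // Large enough to be allocated on the LOH.
                buffer = new byte[100000];
                total += Process(buffer);

                // Drop the reference so the array is eligible for collection
                // during the iterations that never touch 'buffer'.
                buffer = null;
            }

            // ... other work that does not use 'buffer' ...
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Run(5000)); // prints 500000
    }
}
```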
 
Hi,

| One thing is sure:
|
| CODE 2 is slower because it creates a new reference to TimeSpan ts in each
| loop.

Incorrect. The only difference is that CODE 1 will have ts available AFTER
the for loop. The second variant uses no more memory: only one
instance of TimeSpan exists, and it is overwritten on each iteration.

| But objects probably disposed same way, because when you set reference to
| new object GC collect old object.

The outcome is the same, but not for the reason given. There is only one
instance, so no new object is ever created. TimeSpan is a struct, so THE GC
HAS NOTHING TO DO WITH IT. The memory is reclaimed only when the method call
returns.

Take a look at http://www.yoda.arachsys.com/csharp/memory.html for a very
good explanation of memory management in C#.
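
One way to check the "struct means no GC involvement" claim is to measure heap allocations around each kind of loop. A rough sketch, assuming a runtime that provides GC.GetAllocatedBytesForCurrentThread (older .NET Framework versions do not); the class and method names are made up. One would expect the struct loop to report zero or close to it, but an optimizing JIT may also remove allocations from the array loop, so no figures are claimed here.

```csharp
using System;

static class AllocationDemo
{
    // Value type in the loop: the TimeSpan lives in the method's own storage.
    public static long StructLoop()
    {
        long ticks = 0;
        for (int i = 0; i < 1000; i++)
        {
            TimeSpan ts = new TimeSpan(i);
            ticks += ts.Ticks;
        }
        return ticks;
    }

    // Reference type in the loop: each 'new' normally allocates on the GC heap.
    public static long ArrayLoop()
    {
        long length = 0;
        for (int i = 0; i < 1000; i++)
        {
            object[] arr = new object[4];
            length += arr.Length;
        }
        return length;
    }

    static void Main()
    {
        long start = GC.GetAllocatedBytesForCurrentThread();
        StructLoop();
        long afterStruct = GC.GetAllocatedBytesForCurrentThread();
        ArrayLoop();
        long afterArray = GC.GetAllocatedBytesForCurrentThread();

        Console.WriteLine("struct loop allocated: " + (afterStruct - start) + " bytes");
        Console.WriteLine("array loop allocated:  " + (afterArray - afterStruct) + " bytes");
    }
}
```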
 
Hi,

|
| In theory CODE 1 should create one variable that remains in scope for
| the entire function and overwrite its contents over and over (unless you
| have ts = new TimeSpan() inside the loop), Code 2 would create a new
| variable on each iteration of the loop that goes out of scope every
| time.

This has been clearly answered by Marc.

| Personally I try not to declare variables inside of a loop, but I never
| actually checked how this influences memory usage/runtime behavior.

Quite the opposite: you should declare a variable in the innermost scope
possible. This prevents "pollution" in the code, and the scope clearly marks
where the variable is used.
In the OP's case, if you declare ts inside the loop it's clear that you only
use it, well, inside the loop. If you declare it outside, there is doubt
about whether the variable is used AFTER the loop.

My recommendation is to always use the tightest scope possible.
 
Hi,

|
| Thank you for clearing that up, I wasn't sure how the compiler would
| handle this kind of code when turning it into IL code.
| In the case of a value type it wouldn't make a difference anyway, but
| I assumed the question was aimed at objects in general and thought that
| in case of a reference type it would.

It depends on the interpretation of
ts = .....

If it creates a new instance, like
ts = new .....

then yes, you will create a new instance each time. Even in this case both
versions will be similar, EXCEPT that CODE 1 will be "slower" as it creates
one more instance (outside the loop) than CODE 2.
 
| Quite the opposite: you should declare a variable in the innermost scope
| possible. This prevents "pollution" in the code, and the scope clearly marks
| where the variable is used.
| In the OP's case, if you declare ts inside the loop it's clear that you only
| use it, well, inside the loop. If you declare it outside, there is doubt
| about whether the variable is used AFTER the loop.
|
| My recommendation is to always use the tightest scope possible.

From a readability point of view you are certainly right about that,
but the initial question was aimed towards performance issues, in which
case the first implementation would be favorable imho (unless you use
"new" inside the loop, as stated).

Sincerely,
Kevin Wienhold
 
| but the initial question was aimed towards performance issues, in which
| case the first implementation would be favorable imho (unless you use
| "new" inside the loop, as stated).

Thanks.
Yes, there is always a "new" in the loop, and I could have used a class
instead of a TimeSpan.
 
In the OP's original CODE 1, the variable ts isn't instantiated outside the
loop. Thus, its value will be null. There is no performance penalty other than
the single machine instruction required to set a memory location to 0.

Mike Ober.
 
Michael D. Ober said:
| In the OP's original CODE 1, the variable ts isn't instantiated outside the
| loop. Thus, its value will be null. There is no performance penalty
| other than the single machine instruction required to set a memory
| location to 0.
|
| Mike Ober.

Not even that: the variable will simply not be definitely assigned, and the
compiler will optimize the difference away.

Christof
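
For the reference-type case the OP confirmed, counting constructor calls shows that a bare declaration outside the loop creates no extra instance. A minimal sketch; Tracked and InstanceCountDemo are made-up names for illustration:

```csharp
using System;

// Hypothetical class that counts how many instances have been constructed.
class Tracked
{
    public static int Created;

    public Tracked()
    {
        Created++;
    }
}

static class InstanceCountDemo
{
    // CODE 1 style: declaration outside the loop, assignment inside.
    public static int DeclaredOutside()
    {
        Tracked.Created = 0;
        Tracked t; // declaration only: nothing is constructed here
        for (int i = 0; i < 1000; i++)
        {
            t = new Tracked();
        }
        return Tracked.Created;
    }

    // CODE 2 style: declaration and assignment inside the loop.
    public static int DeclaredInside()
    {
        Tracked.Created = 0;
        for (int i = 0; i < 1000; i++)
        {
            Tracked t = new Tracked();
        }
        return Tracked.Created;
    }

    static void Main()
    {
        Console.WriteLine(DeclaredOutside()); // prints 1000
        Console.WriteLine(DeclaredInside());  // prints 1000
    }
}
```

Both versions construct exactly 1000 objects, so the choice between them can be made on readability alone.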
 
