How does "new" work in a loop?

Barry Kelly

Göran Andersson said:
Does that mean that the compiler adds code to remove the reference from
the fs variable? As long as the reference is there, the garbage
collector won't collect the object.

If the 'fs' variable is enregistered, or its location on the stack is
reused for another variable in the interest of reducing stack
consumption, then it may be overwritten and thus won't be visible to the
GC any more.

The JIT doesn't maintain the lifetime of a variable for its entire
lexical scope, except maybe if you've compiled to debug and are running
under the debugger.

You'd be surprised by what the GC will collect. I know I was. I've been
investigating a bug since yesterday evening that was most enlightening,
with respect to this behaviour. It can even collect objects referred to
by the object whose instance method is currently on the stack, under the
right circumstances.
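
When a reference really must stay alive up to a certain point, GC.KeepAlive marks that point. The following is only a minimal sketch, assuming a release build run outside the debugger; Flag, Tracked and KeepAliveDemo are made-up names for illustration. Because of the KeepAlive call, the object counts as reachable during the collection, so the method should report False.

```csharp
using System;

// Hypothetical helper: records whether one particular object was finalized.
class Flag
{
    public bool Finalized;
}

class Tracked
{
    readonly Flag flag;

    public Tracked(Flag flag) { this.flag = flag; }

    // Runs on the finalizer thread once the GC decides the object is dead.
    ~Tracked() { flag.Finalized = true; }
}

class KeepAliveDemo
{
    // Returns true if the object was finalized before the KeepAlive call.
    public static bool FinalizedBeforeKeepAlive()
    {
        Flag flag = new Flag();
        Tracked t = new Tracked(flag);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        bool finalizedEarly = flag.Finalized;

        // t is treated as reachable up to this call, so the collection
        // above was not allowed to reclaim it.
        GC.KeepAlive(t);
        return finalizedEarly;
    }

    static void Main()
    {
        Console.WriteLine("Finalized before KeepAlive: " + FinalizedBeforeKeepAlive());
    }
}
```

Without the KeepAlive call, a release build may well finalize the object during the collection, since nothing reads t afterwards.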

-- Barry
 
Michael D. Ober

Barry Kelly said:
If the 'fs' variable is enregistered, or its location on the stack is
reused for another variable in the interest of reducing stack
consumption, then it may be overwritten and thus won't be visible to the
GC any more.

The JIT doesn't maintain the lifetime of a variable for its entire
lexical scope, except maybe if you've compiled to debug and are running
under the debugger.

You'd be surprised by what the GC will collect. I know I was. I've been
investigating a bug since yesterday evening that was most enlightening,
with respect to this behaviour. It can even collect objects referred to
by the object whose instance method is currently on the stack, under the
right circumstances.

-- Barry

The issue here is that when the GC finds an object to collect, it must
follow all the links from that object and collect those first. If it hits a
reference loop, it stops at the object that refers to the start of the
collection link and works backwards.

Mike Ober.
 
Jon Skeet [C# MVP]

Göran Andersson said:
Does that mean that the reference is removed from the variable?
Otherwise the garbage collector will still see the reference and can't
collect the object.

No, it just means that the variable's value isn't considered when the
GC works out which references are "live".
 
Jon Skeet [C# MVP]

Michael D. Ober said:
The issue here is that when the GC finds an object to collect, it must
follow all the links from that object and collect those first. If it hits a
reference loop, it stops at the object that refers to the start of the
collection link and works backwards.

Firstly, I don't see how that's relevant to this situation.

Secondly, it's just not true. There is nothing to say that a "parent"
object can only be collected after its "children". For one thing, there
can be a cyclic reference, in which case both can be collected in
either order.
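
A small sketch of the cyclic case, for anyone who wants to try it. Node and CycleDemo are hypothetical names, and the cycle is built in a separate non-inlined method so that no local variable in Main refers to either object. Nothing guarantees when a collection actually reclaims anything, so treat the printed values as an observation rather than a promise.

```csharp
using System;
using System.Runtime.CompilerServices;

class Node
{
    public Node Other; // reference to the partner node, forming a cycle
}

class CycleDemo
{
    // Builds two objects that refer to each other and returns only weak
    // references, which do not keep their targets alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference[] MakeCycle()
    {
        Node parent = new Node();
        Node child = new Node();
        parent.Other = child;
        child.Other = parent;
        return new WeakReference[]
        {
            new WeakReference(parent),
            new WeakReference(child)
        };
    }

    static void Main()
    {
        WeakReference[] refs = MakeCycle();

        GC.Collect();
        GC.WaitForPendingFinalizers();

        // If the cycle was collected, both report False; neither object
        // has to wait for the other.
        Console.WriteLine("parent alive: " + refs[0].IsAlive);
        Console.WriteLine("child alive:  " + refs[1].IsAlive);
    }
}
```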
 
Jon Skeet [C# MVP]

John J. Hughes II said:
Well Jon, you can cite what is supposed to happen, but I have to deal with
what really happens. I write services that run constantly and in some
cases don't return much idle time back to the system for days. I have
found that <var>=null on non-disposable values and using(<statement>) allows
my program to maintain an even memory allocation and stops the memory creep.

Obviously you should use using statements - but that *doesn't* reclaim
any memory.

However, setting things to null when they're about to go out of scope
is *not* helping you. Really, it's not. If you believe it is, please
write a short but complete program that demonstrates it in release
mode. I can easily write a short but complete program which
demonstrates objects being garbage collected before variables referring
to them reach the end of their lexical scope.

John J. Hughes II said:
I will grant you in my code I am probably using them to excess, but having my
customers tell me of memory errors after running my program for X +/- days
depending on load can be really hard to track down. This stopped after
adding the set-to-null statements and using statements.

Using statements may well have made a difference to that (particularly
with classes with finalizers) but I'm afraid I just don't believe that
setting variables to null when they're about to go out of scope does
any good - and it clutters up the code.

John J. Hughes II said:
Note in forms applications I normally don't use them as much, since the
system is normally idle.

As you say "IMO" ;>

Well, I've got good evidence that things can be garbage collected
without local variables being set to null before they go out of scope -
do you have any evidence that *just* setting things to null helps?
 
Tony Sinclair

Any help,
including pointers to the VS docs or a popular book on C#, would be
appreciated.

My sincere gratitude to everyone who responded. I was unaware of both
the "using" statement's use in this context, and the difference in GC
behaviour between debug and release versions. I also found Mr.
Skeet's essay on software sins (on his blog) very interesting.
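
For reference, the loop from the original question could be sketched with a using statement along these lines. This is only a sketch under assumptions: Combine and CombineFiles are made-up names, fileNames is taken to be a string array, and the progress bar is left out. It also copies through a fixed-size buffer instead of allocating one array per file, which is a change from the original loop rather than a requirement.

```csharp
using System;
using System.IO;

class CombineFiles
{
    // Appends each input file to fsOut. The using block closes every input
    // stream at the closing brace, even if Read or Write throws.
    public static void Combine(string[] fileNames, Stream fsOut)
    {
        byte[] buffer = new byte[64 * 1024];

        for (int i = 0; i < fileNames.Length; i++)
        {
            using (FileStream fs = new FileStream(fileNames[i],
                FileMode.Open, FileAccess.Read, FileShare.Read))
            {
                int read;
                // Read may return fewer bytes than requested, so loop
                // until it reports the end of the file.
                while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
                {
                    fsOut.Write(buffer, 0, read);
                }
            }
        }
    }

    static void Main()
    {
        // Build two small input files in a scratch directory.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        string a = Path.Combine(dir, "a.bin");
        string b = Path.Combine(dir, "b.bin");
        string output = Path.Combine(dir, "out.bin");
        File.WriteAllBytes(a, new byte[] { 1, 2, 3 });
        File.WriteAllBytes(b, new byte[] { 4, 5 });

        using (FileStream fsOut = new FileStream(output, FileMode.Create, FileAccess.Write))
        {
            Combine(new string[] { a, b }, fsOut);
        }

        Console.WriteLine(new FileInfo(output).Length); // 5 bytes: 3 + 2
        Directory.Delete(dir, true);
    }
}
```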

Thank you,
Tony
 
Jon Skeet [C# MVP]

Matt said:
Yes, I know this. The OP was a C++ programmer, I was giving it to him
in C++ context. GC is deterministic, it will kick in when it makes
sense to kick in.

But that's the point - it's *different* to C++, so saying what would
happen in a C++ context is misleading. (And the GC is
non-deterministic, not deterministic. I suspect that's what you meant
to say.)
 
Ian Semmel

I reckon that if the thing that you are creating is a GDI object like a font,
the system can run out of resources before it gets around to GCing them.

Everything is not as automagic as sometimes assumed.
 
Matt

Jon said:
But that's the point - it's *different* to C++, so saying what would
happen in a C++ context is misleading. (And the GC is
non-deterministic, not deterministic. I suspect that's what you meant
to say.)

Yes, that's what I meant to say. Brain was on "off" this morning before
the first cup of coffee :)

I'm not sure it's misleading to help people understand things in the
language they know best. I came from the C world, and moving into C++ was
a rather ugly experience back then (back in <mumble>) that was made easier
by thinking of it as first a "better C" (prototypes, being able to define
variables inline, etc). Only when I really "got" the syntax and structure
of C++ could I begin to think that way. I think C# is the same way.

Just my $0.02, of course.
Matt
 
John J. Hughes II

Ok, next time I find the time I'll do that...

By the way, I don't believe that using and/or setting a value to null
directly releases memory; it just marks the memory as not being needed. I
have found that this is more relevant in nested instances when each class or
value tells the compiler the value is no longer needed. It seems to help
in loops where the new instance is used, but does not have as big an impact.

Personally I don't think it really clutters code, for a couple of reasons.
First of all, setting a value to null at the end of its logical use reminds
me not to use it later, sort of a note saying it's not available. The second
reason is I normally use them in the Dispose call, which is forced by the
using statement.

So you end up with something like:

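using System;

public class myClass : IDisposable
{
    byte[] data = new byte[1000];

    // Called automatically at the end of a using block.
    public void Dispose()
    {
        data = null;
    }

    public void dosomething()
    {
    }
}

public class Caller
{
    public void fun()
    {
        using (myClass c = new myClass())
            c.dosomething();
    }
}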

Regards,
John
 
Jon Skeet [C# MVP]

Matt said:
I'm not sure its misleading to help people understand things in the
language they know best.

It is when the truth is different.

You said that objects are cleaned up when they fall out of scope. That
is simply not true. It implies deterministic clean-up which doesn't
exist in .NET.

In particular, someone who has been told and believes that objects are
destroyed deterministically is likely to use finalizers to implement
C++-style RAII - which simply won't work in .NET.
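
The closest thing .NET offers to scope-based clean-up is IDisposable together with a using statement. A minimal sketch, with Resource and ScopeDemo as made-up names and a list standing in for whatever would really be acquired and released:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical resource that records when it is acquired and released.
class Resource : IDisposable
{
    readonly List<string> log;

    public Resource(List<string> log)
    {
        this.log = log;
        log.Add("acquired");
    }

    // Called by the using statement at the closing brace, not by the GC.
    public void Dispose()
    {
        log.Add("released");
    }
}

class ScopeDemo
{
    public static List<string> Run()
    {
        List<string> log = new List<string>();

        using (Resource r = new Resource(log))
        {
            log.Add("working");
        }

        log.Add("after using");
        return log;
    }

    static void Main()
    {
        // Prints: acquired, working, released, after using
        Console.WriteLine(string.Join(", ", Run().ToArray()));
    }
}
```

The release happens at a known point in the code, which is exactly what a finalizer cannot promise. The memory for the object itself is still reclaimed whenever the GC gets to it.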
 
Jon Skeet [C# MVP]

John J. Hughes II said:
Ok, next time I find the time I'll do that...

By the way, I don't believe that using and/or setting a value to null
directly releases memory; it just marks the memory as not being needed.

No, it does no such thing. Memory is never marked as not being needed -
the "mark" part of "mark and sweep" is marking things which *are* still
needed. Now, if the JIT can tell that a variable will no longer be
read, it won't use that variable as a root when considering which
objects are still in use.

John J. Hughes II said:
I have found that this is more relevant in nested instances when each
class or value tells the compiler the value is no longer needed. It
seems to help in loops where the new instance is used but does not
have as big an impact.

That suggests you believe you have some evidence that it has an effect.
I really doubt that you have - in release mode at least. (In debug mode
it would make a difference, but that's not a good reason to add more
code in, IMO.)

Here's some code which demonstrates that the GC doesn't need anything
to be set to null in order to finalize and then free it:

using System;

class Test
{
    ~Test()
    {
        Console.WriteLine ("Finalizer called");
    }

    static void Main()
    {
        Test t = new Test();

        Console.WriteLine ("Calling GC");
        GC.Collect();
        GC.WaitForPendingFinalizers();

        Console.WriteLine ("End of method");
    }
}

The results are:
Calling GC
Finalizer called
End of method

So the finalizer is being called before the end of the method - no need
for nulling the variable out. Now, I know that the finalizer being
called isn't the same thing as the object being freed, but it shows
that the GC considers it not to be needed any more.

John J. Hughes II said:
Personally I don't think it really clutters code, for a couple of reasons.
First of all, setting a value to null at the end of its logical use reminds
me not to use it later, sort of a note saying it's not available. The second
reason is I normally use them in the Dispose call, which is forced by the
using statement.

So you end up with something like:

using System;

public class myClass : IDisposable
{
    byte[] data = new byte[1000];

    // Called automatically at the end of a using block.
    public void Dispose()
    {
        data = null;
    }

    public void dosomething()
    {
    }
}

public class Caller
{
    public void fun()
    {
        using (myClass c = new myClass())
            c.dosomething();
    }
}

If your class doesn't use any unmanaged resources either directly or
indirectly, there's really very little point in implementing
IDisposable in the first place. Just let the object get collected when
the GC notices it's not used - I don't think you're doing anything to
improve garbage collection using the above, but you're forcing yourself
to remember to use the using statement when you really don't need to.
 
John J. Hughes II

I do agree the memory is not marked... poor verbiage on my part.

I don't think your example really proves anything, since you are calling
garbage collection. I have no argument that when GC runs it will clean up
memory that is not being used. I personally believe that all references to
a variable are not removed in a timely fashion unless you tell them to be.
The key here is timely.

Again, as I have said, I had a problem with memory creep. The only change I
made was to add using statements; the problem slowed down but was not
eliminated. The second change was to add value=null statements (shotgun
blast style) and the problem went away. Since it was a production system I
used great care to change as little as possible, so I really don't think I
fixed any other problems.

If at some point in the near future I can give you code which proves my
point I will be happy to, but the last time I had the problem it required a
system running full blown for 14 days on average.

That being said, I may have gotten my head wet and decided it was raining
when it was snowing. I decided to use an umbrella and my head is not wet
now.

Regards,
John

 
Jon Skeet [C# MVP]

John J. Hughes II said:
I do agree the memory is not marked... poor verbiage on my part.

I don't think your example really proves anything since you are calling
garbage collection.

Well, I can make an example which ends up garbage collecting due to
other activity if you want. It'll do the same thing. Just change the
call to GC.Collect() to

for (int i = 0; i < 10000000; i++)
{
    byte[] b = new byte[1000];
}

and you'll see the same thing.

John J. Hughes II said:
I have no argument that when GC runs it will clean up
memory that is not being used. I personally believe that all references to
a variable are not removed in a timely fashion unless you tell them to be.
The key here is timely.

It's not a matter of the reference being removed. It's a case of the
release-mode garbage collector ignoring variables which are no longer
relevant.

John J. Hughes II said:
Again, as I have said, I had a problem with memory creep. The only change I
made was to add using statements; the problem slowed down but was not eliminated.

And *that* can have a significant impact - because many classes which
implement IDisposable also have finalizers which are suppressed when
you call Dispose. That really *does* affect when the memory can be
freed, and can make a big difference.
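
A minimal sketch of that pattern, assuming a class that owns a block of unmanaged memory. NativeBuffer is a hypothetical name, and this is the simplest form of the idea rather than the full recommended dispose pattern.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
class NativeBuffer : IDisposable
{
    IntPtr handle;

    public NativeBuffer(int size)
    {
        handle = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return handle == IntPtr.Zero; }
    }

    public void Dispose()
    {
        Release();
        // The finalizer has nothing left to do, so tell the GC not to
        // queue this object for finalization.
        GC.SuppressFinalize(this);
    }

    // Safety net for callers who forget to call Dispose.
    ~NativeBuffer()
    {
        Release();
    }

    void Release()
    {
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }
    }
}

class SuppressDemo
{
    static void Main()
    {
        NativeBuffer buffer = new NativeBuffer(1024);
        using (buffer)
        {
            Console.WriteLine("Disposed inside using: " + buffer.IsDisposed);
        }
        Console.WriteLine("Disposed after using:  " + buffer.IsDisposed);
    }
}
```

An object whose finalizer has not been suppressed has to survive until the finalizer thread has run it, so its memory is only freed by a later collection. Calling Dispose removes that delay.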

John J. Hughes II said:
The second change was to add value=null statements (shotgun blast style) and
the problem went away. Since it was a production system I used great care
to change as little as possible, so I really don't think I fixed any other
problems.

I'm afraid I still don't believe you saw what you claimed to be seeing
- not on a production system. You *would* see improvements in a
debugger, but that's a different matter.

John J. Hughes II said:
If at some point in the near future I can give you code which proves my
point I will be happy to, but the last time I had the problem it required a
system running full blown for 14 days on average.

That being said, I may have gotten my head wet and decided it was raining
when it was snowing. I decided to use an umbrella and my head is not wet
now.

I really suspect you were mistaken, I'm afraid.
 
Barry Kelly

Michael D. Ober said:
The issue here is that when the GC finds an object to collect, it must
follow all the links from that object and collect those first. If it hits a
reference loop, it stops at the object that refers to the start of the
collection link and works backwards.

I think this is a different issue, but I still need to comment:

A copying garbage collector actually works differently. The CLR
collector is compacting, which is a kind of copying collector. It looks
for objects that are *alive*, and any space that's left over is garbage
that can be collected. Because the cost of a collection depends on the
amount of live data rather than the amount of garbage, GC can, with
enough garbage, outperform manual memory allocation. I'll refer you to
this for more information:

http://citeseer.ist.psu.edu/appel87garbage.html
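
To make the "looks for what is alive" idea concrete, here is a toy model of the tracing step. It is not how the CLR is implemented; ToyObject and ToyMark are invented for illustration, and real collectors work on raw memory, not on a List of references. The point is only that garbage is never visited, and that an unreachable cycle needs no special handling.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for a heap object with outgoing references.
class ToyObject
{
    public readonly string Name;
    public readonly List<ToyObject> References = new List<ToyObject>();

    public ToyObject(string name) { Name = name; }
}

class ToyMark
{
    // Returns every object reachable from the roots. Anything not in the
    // result is garbage, and the loop never touches it.
    public static HashSet<ToyObject> Mark(IEnumerable<ToyObject> roots)
    {
        HashSet<ToyObject> live = new HashSet<ToyObject>();
        Stack<ToyObject> pending = new Stack<ToyObject>(roots);

        while (pending.Count > 0)
        {
            ToyObject current = pending.Pop();
            if (!live.Add(current))
                continue; // already marked, which also stops cycles

            foreach (ToyObject next in current.References)
                pending.Push(next);
        }
        return live;
    }

    static void Main()
    {
        ToyObject a = new ToyObject("a");
        ToyObject b = new ToyObject("b");
        ToyObject c = new ToyObject("c");
        ToyObject d = new ToyObject("d");

        a.References.Add(b); // a is a root and refers to b
        c.References.Add(d); // c and d refer to each other,
        d.References.Add(c); // but nothing reachable refers to them

        HashSet<ToyObject> live = Mark(new ToyObject[] { a });

        // Prints a: live, b: live, c: garbage, d: garbage (one per line).
        foreach (ToyObject o in new ToyObject[] { a, b, c, d })
            Console.WriteLine(o.Name + ": " + (live.Contains(o) ? "live" : "garbage"));
    }
}
```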

-- Barry
 
Matt

Jon said:
It is when the truth is different.

You said that objects are cleaned up when they fall out of scope. That
is simply not true. It implies deterministic clean-up which doesn't
exist in .NET.

In particular, someone who has been told and believes that objects are
destroyed deterministically is likely to use finalizers to implement
C++-style RAII - which simply won't work in .NET.

I stand corrected. Thanks for the better explanation.

Matt
 
Göran Andersson

Jon said:
No, it just means that the variable's value isn't considered when the
GC works out which references are "live".

How would the GC know the scope of the variable, when the scope is
something that only the compiler is aware of?
 
Göran Andersson

That's why the Font class implements IDisposable. When you call Dispose
it will free the GDI resource, so that it doesn't matter when the object
is garbage collected.

Ian said:
I reckon that if the thing that you are creating is a GDI object like a
font, the system can run out of resources before it gets around to GCing
them.

Everything is not as automagic as sometimes assumed.

Tony said:
I'm just learning C#. I'm writing a program (using Visual C# 2005 on
WinXP) to combine several files into one (HKSplit is a popular
freeware program that does this, but it requires all input and output
to be within one directory, and I want to be able to combine files
from different directories into another directory of my choice).

My program seems to work fine, but I'm wondering about this loop:


for (int i = 0; i < numFiles; i++)
{
    // read next input file

    FileStream fs = new FileStream(fileNames[i],
        FileMode.Open, FileAccess.Read, FileShare.Read);
    Byte[] inputBuffer = new Byte[fs.Length];

    fs.Read(inputBuffer, 0, (int)fs.Length);
    fs.Close();

    // append to output stream previously opened as fsOut

    fsOut.Write(inputBuffer, 0, (int) inputBuffer.Length);
    progBar.Value++;
} // for int i

As you can see, the objects fs and inputBuffer are both created as
"new" each time through the loop, which could be many times. I didn't
think this would work; I just tried it to see what kind of error
message I would get, and I was surprised when it ran. Every test run
has produced perfect results.

So what is happening here? Is the memory being reused, or am I piling
up objects on the heap that will only go away when my program ends, or
am I creating a huge memory leak?

I can see that fs might go away after fs.Close(), but I don't
understand why I'm allowed to recreate the byte array over and over,
without ever disposing of it. I have verified with the debugger that
the array has a different size each time the input file size changes,
so it really is being reallocated each time through the loop, rather
than just being reused. I've tried to find explanations of how "new"
works in a loop, but I haven't been able to so far. Any help,
including pointers to the VS docs or a popular book on C#, would be
appreciated.
 
Göran Andersson

Barry said:
If the 'fs' variable is enregistered, or its location on the stack is
reused for another variable in the interest of reducing stack
consumption, then it may be overwritten and thus won't be visible to the
GC any more.

Yes, it might. On the other hand it might not.

In this case it's not very likely that the stack space will be reused
inside the loop, is it? It's needed for the fs variable in the next
iteration of the loop.
 
Barry Kelly

Göran Andersson said:
Yes, it might. On the other hand it might not.

In this case it's not very likely that the stack space will be reused
inside the loop, is it?

It's unpredictably likely. Compile this program in release mode and run
it:

---8<---
using System;

class App
{
    class A
    {
        public void Foo()
        {
        }

        ~A()
        {
            Console.WriteLine("A finalized.");
        }
    }

    static void Main()
    {
        for (int i = 0; i < 2; ++i)
        {
            Console.WriteLine("Loop Start");
            A a = new A();
            a.Foo();
            GC.Collect();
            GC.WaitForPendingFinalizers();
            Console.WriteLine("Loop End");
        }
    }
}
--->8---

What would you expect it to output? If the slot or whatever for "a"
isn't freed up until the next iteration, this is what I'd expect:

---8<---
Loop Start
Loop End
Loop Start
A finalized.
Loop End
A finalized.
--->8---

But that isn't what happens:

---8<---
Loop Start
A finalized.
Loop End
Loop Start
A finalized.
Loop End
--->8---

GC can be surprisingly proactive. I've made an entry on my blog today on
precisely this topic:

http://barrkel.blogspot.com/2006/07/not-so-lazy-garbage-collector.html

-- Barry
 
