Best practice using large objects in foreach

Benny · Jul 27, 2006

I just wanted to throw the discussion out there on what people feel the
best practice is for using large objects in a foreach loop. For example,
if you are reusing an Image object in a loop like this (the letters above
the snippets are for reference purposes):

A
foreach ( string s in myList )
{
    Image img = Image.FromFile( s );
    // operate on img
}

would it run more efficiently like this?

B
Image img = null;
foreach ( string s in myList )
{
    img = Image.FromFile( s );
    // operate on img
}

or

C
foreach ( string s in myList )
{
    using ( Image img = Image.FromFile( s ) )
    {
        // operate on img
    }
}

the list goes on....

Let me know what your thoughts are.

Marc Gravell · Jul 27, 2006

In none of the cases are you re-using the object; you are just re-using
the variable, which is just (effectively) an integer, so no real size.
In fact, the nesting is likely to be normalized by the compiler, so no
real difference between A and B. Even if you could img.LoadFrom (i.e.
using the same object) it would likely assign a new memory block
internally, so you wouldn't re-use the "big" portion of memory.

Disposing a disposable object is a good thing for loops like this, so C
is good. Do this ;-p

Marc

Benny · Jul 27, 2006

So by using A and B, nothing is disposed automatically by the GC once
it hits the end of the loop?

Laura T · Jul 27, 2006

I don't see any object reuse, at most variable reuse. The compiler does not
care much about it, nor does the CLR.
From a performance point of view, A & B are the same. C is slower because
of using().

Marc Gravell · Jul 27, 2006

No; you need to (in your mind) separate *object* lifetime, and
*variable* lifetime. img is the variable; it isn't img that you are
trying to dispose, but the object that it points to (at that time).

Consider:

Image img = Image.FromFile(whatever);
img = Image.FromFile(somethingElse);
img.Dispose();

here, I have 1 variable, but 2 objects, only one of which is disposed;
when I perform the second "img = " assignment, the first image stays in
memory, as an orphan on the managed heap. The garbage collector [GC]
will probably spot it after a while and destroy it (I'm assuming it has
a finaliser or similar if it uses unmanaged image handles).

The second object will also be orphaned when our code completes, making
it answerable to the GC - but in this case the finalizer will have
probably been cancelled by the Dispose() call (after freeing the
resources). This has 2 advantages: firstly, the resources are freed
much sooner (i.e. when you are done with them, rather than when the GC
spots them), and secondly the GC can destroy the object in one pass
rather than two (if an object needs finalizing, then it finalizes the
first time the GC spots it, and takes it off the managed heap on the
second pass).
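
A minimal sketch of the "1 variable, 2 objects" point, not code from the thread. TrackedResource is a hypothetical stand-in for Image that only counts constructions and Dispose() calls, so the sketch runs without any image files. It has no finalizer, so it says nothing about when the GC would reclaim the orphaned first object.

```csharp
using System;

// Hypothetical stand-in for Image: counts how many instances were created and disposed.
sealed class TrackedResource : IDisposable
{
    public static int Created;
    public static int Disposed;
    private bool _disposed;

    public TrackedResource() { Created++; }

    public void Dispose()
    {
        if (_disposed) return;   // calling Dispose twice is harmless
        _disposed = true;
        Disposed++;
    }
}

static class OneVariableTwoObjects
{
    public static void Run()
    {
        TrackedResource res = new TrackedResource(); // first object
        res = new TrackedResource();                 // second object; the first is now unreachable
        res.Dispose();                               // disposes only the second object
    }

    static void Main()
    {
        Run();
        // prints "created=2, disposed=1"
        Console.WriteLine("created={0}, disposed={1}",
            TrackedResource.Created, TrackedResource.Disposed);
    }
}
```

The variable is reused, but the first object is never disposed; with a real Image, its unmanaged resources would stay allocated until finalization.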

C is the good code. Stick with this. You could add disposal to A or B,
but why bother? You'd still want to dispose *each image* as you are done
with it (not the single variable). You can't use the "using" syntax
with an existing variable (img), which means you'd need a try /
finally... so it would make the code more complex (to do properly).
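
To illustrate the extra complexity mentioned above, here is a rough sketch of option B with disposal added by hand, next to option C. TrackedResource, ManualDispose and UsingDispose are hypothetical names introduced for this example; TrackedResource stands in for Image and just records which instances were disposed. The compiler expands "using" into roughly the same try / finally (plus a null check).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Image; records names in the order they are disposed.
sealed class TrackedResource : IDisposable
{
    public static readonly List<string> DisposedNames = new List<string>();
    public readonly string Name;

    public TrackedResource(string name) { Name = name; }

    public void Dispose() { DisposedNames.Add(Name); }
}

static class DisposeInLoop
{
    // Option B with disposal added by hand: one try / finally per iteration.
    public static void ManualDispose(IEnumerable<string> names)
    {
        TrackedResource res = null;
        foreach (string s in names)
        {
            res = new TrackedResource(s);
            try
            {
                // operate on res
            }
            finally
            {
                res.Dispose(); // runs even if the body throws
            }
        }
    }

    // Option C: "using" disposes each object at the end of its block.
    public static void UsingDispose(IEnumerable<string> names)
    {
        foreach (string s in names)
        {
            using (TrackedResource res = new TrackedResource(s))
            {
                // operate on res
            }
        }
    }

    static void Main()
    {
        string[] names = { "a.png", "b.png" };
        ManualDispose(names);
        UsingDispose(names);
        // prints "a.png,b.png,a.png,b.png"
        Console.WriteLine(string.Join(",", TrackedResource.DisposedNames.ToArray()));
    }
}
```

Both variants dispose every object once per iteration; the "using" form is shorter and harder to get wrong.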

Marc

Marc Gravell · Jul 27, 2006

No; C is /vastly/ more efficient overall, as the memory footprint is
minimised during processing due to early disposal. In A & B this is
performed by the GC, so you will see (for a large set) the memory usage
ramp for a while, then processing grind to a halt as the GC tears down
the improperly discarded images, and then the memory start ramping up
again. I would anticipate C to stay fairly constant both in terms of
memory footprint and processing rate, and it will put much less stress
on the rest of the system - important in a server environment.
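
A rough way to see the footprint argument without measuring real memory: the sketch below counts how many resources are "open" (created but not yet disposed) at the same time. TrackedResource and FootprintSketch are hypothetical names, and the counter is only a model; it does not measure actual memory, GC pauses or throughput.

```csharp
using System;

// Hypothetical stand-in for Image; tracks how many instances are open at once.
sealed class TrackedResource : IDisposable
{
    public static int Open;
    public static int PeakOpen;
    private bool _disposed;

    public TrackedResource()
    {
        Open++;
        if (Open > PeakOpen) PeakOpen = Open;
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        Open--;
    }

    public static void Reset() { Open = 0; PeakOpen = 0; }
}

static class FootprintSketch
{
    // Options A / B: nothing is disposed, so the open count only ever grows.
    // (This model has no finalizer, so nothing closes a resource later.)
    public static int PeakWithoutDispose(int count)
    {
        TrackedResource.Reset();
        for (int i = 0; i < count; i++)
        {
            TrackedResource res = new TrackedResource();
            // operate on res
        }
        return TrackedResource.PeakOpen;
    }

    // Option C: each resource is closed before the next one is opened.
    public static int PeakWithUsing(int count)
    {
        TrackedResource.Reset();
        for (int i = 0; i < count; i++)
        {
            using (TrackedResource res = new TrackedResource())
            {
                // operate on res
            }
        }
        return TrackedResource.PeakOpen;
    }

    static void Main()
    {
        Console.WriteLine("A/B peak open: {0}", PeakWithoutDispose(1000)); // 1000
        Console.WriteLine("C peak open: {0}", PeakWithUsing(1000));        // 1
    }
}
```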

Marc

Marc Gravell · Jul 27, 2006

No; C is /vastly/ more efficient overall...

(this was in response to Laura's post; sorry if that was unclear)

Marc Gravell · Jul 27, 2006

Benny said:
So by using A and B, nothing is disposed automatically by the GC once
it hits the end of the loop?

A further clarification:
a: the GC does not dispose; it finalises (if required)
b: the GC doesn't care about "the end of the loop"; it runs on its own
thread, and hunts for abandoned objects in its own sweet time. The CLR
doesn't use reference counting etc (e.g. from COM) that would allow it
to destroy things when they go out of scope, as this is prone to orphan
islands of memory (memory leaks). Rather, it periodically scans
*everything* to see what can still be reached by an active (reachable)
object.
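
A small sketch of the reachability idea, under the assumption of a typical CLR with a tracing collector. Node and CycleSketch are hypothetical names. Two objects reference each other (an "island" that simple reference counting would never free), and only a WeakReference is kept. GC.Collect() is called here purely for demonstration; it is not something to do in production code.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical node type used only to build a reference cycle.
sealed class Node
{
    public Node Other;
}

static class CycleSketch
{
    // Builds two objects that reference each other and returns only a weak
    // reference, so no strong reference to the pair survives this method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeOrphanedCycle()
    {
        Node a = new Node();
        Node b = new Node();
        a.Other = b;
        b.Other = a;
        return new WeakReference(a);
    }

    public static bool CycleSurvivesCollection()
    {
        WeakReference weak = MakeOrphanedCycle();
        GC.Collect();                  // forced only to make the demo observable
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;           // false once the cycle has been reclaimed
    }

    static void Main()
    {
        Console.WriteLine("cycle still alive: {0}", CycleSurvivesCollection());
    }
}
```

On the runtimes I would expect this to be run on, the cycle is reclaimed because neither object is reachable from a root, even though each still references the other.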

Marc

Barry Kelly · Jul 27, 2006

Benny said:
I just wanted to throw the discussion out there on what the best
practice people feel is for using large objects in a foreach loop. For
example if you are reusing an Image object in a loop like this (letters
above snippets are for reference purposes):

There are two issues involved here:

1) The scope of locals

2) Disposal of objects which implement IDisposable

The basic rules of thumb, correspondingly, are:

1) The scope should be as small as possible, but large enough so the
variable is visible everywhere it's needed in the function. Preferably,
locals shouldn't be declared until there's a value to initialize them
with - this isn't always possible, though.

2) Locals which implement IDisposable should be used inside a 'using' if
possible; fields which implement IDisposable should be disposed of in
the Dispose(bool) method, and the class should implement the Dispose
pattern.
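
For the second half of rule 2, a minimal sketch of a class that owns an IDisposable field and follows the Dispose pattern. ResourceHolder and TrackedResource are hypothetical names invented for this example (TrackedResource stands in for something like an Image); this is one common shape of the pattern, not the only one.

```csharp
using System;

// Hypothetical stand-in for a disposable resource such as an Image.
sealed class TrackedResource : IDisposable
{
    public bool IsDisposed;
    public void Dispose() { IsDisposed = true; }
}

// Hypothetical class that owns a disposable field and disposes it in Dispose(bool).
class ResourceHolder : IDisposable
{
    private readonly TrackedResource _resource = new TrackedResource();
    private bool _disposed;

    public TrackedResource Resource { get { return _resource; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // nothing left for a finalizer to do
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        if (disposing)
        {
            // Called from Dispose(): safe to touch other managed objects.
            _resource.Dispose();
        }
        // Unmanaged cleanup, if this class held any, would go here.
        _disposed = true;
    }
}

static class DisposePatternSketch
{
    public static bool FieldDisposedWithOwner()
    {
        TrackedResource inner;
        using (ResourceHolder holder = new ResourceHolder())
        {
            inner = holder.Resource;
        }
        return inner.IsDisposed; // the owner disposed its field
    }

    static void Main()
    {
        Console.WriteLine(FieldDisposedWithOwner()); // prints "True"
    }
}
```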

C
foreach ( string s in myList )
{
    using ( Image img = Image.FromFile( s ) )
    {
        // operate on img
    }
}

So, the above is the best option, IMHO.

About local scoping: the scope information of the variable is lost after
compiling. The JIT compiler in the CLR works out the actual lifetime of
the local by analysing when and where it is used. If you try compiling
two programs, one which uses your option B and one which uses C, you'll
find that they compile to the same IL.

-- Barry

Marc Gravell · Jul 27, 2006

Barry Kelly said:
one which uses your option B and one which uses C, you'll
find that they compile to the same IL

I think you mean A and B, plus there's the trivial "=null" assignment
to watch for (although this might not actually create IL, since it is
essentially zero; I can't remember...)

Marc

Barry Kelly · Jul 27, 2006

Barry Kelly said:
two programs, one which uses your option B and one which uses C, you'll

Correction: option A and option B compile to the same IL, not C.

-- Barry

Laura T · Jul 27, 2006

Yes. The C version is the correct way.
I don't think there would be a halt, because the objects remain in gen1.
And on the server side GC is concurrent.
But yes, it's (almost) always imperative to Dispose() as fast as you can.
