textbook authors: passing by ref is NOT more efficient!

Aaron Watters

Hi. I've been teaching C# and I'm tired of students
telling me that they passed ints by ref when they don't
need to because it's "more efficient". I've seen this myth
repeated in many textbooks, but it's completely wrong
in C# (and it's always wrong for ints in any language).

In the case of C++ if you are passing arrays (if I recall)
it can be pretty important to pass by ref and not by
value because otherwise the whole (3GB) array gets
copied onto the call stack.

However, in the case of C# all arrays
are reference types, so you always actually copy a
reference to the array object even if you don't say "ref",
so there is no imperative to pass by ref unless you want
to change the binding of the variable naming the array.
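
A minimal sketch of that distinction, with made-up class and method names: without "ref" the callee can still write into the caller's array, but only with "ref" can it make the caller's variable name a different array.

```csharp
using System;

public static class ArrayBindingDemo
{
    // Receives a copy of the reference: element writes reach the caller's
    // array, but reassigning the parameter does not affect the caller.
    public static void ReplaceByValue(int[] items)
    {
        items[0] = 99;              // mutates the caller's array object
        items = new int[] { 7, 7 }; // rebinds only the local parameter
    }

    // Receives the caller's variable itself: reassignment is visible.
    public static void ReplaceByRef(ref int[] items)
    {
        items = new int[] { 7, 7 }; // rebinds the caller's variable
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3 };

        ReplaceByValue(data);
        Console.WriteLine(data.Length); // 3: still the original array
        Console.WriteLine(data[0]);     // 99: the element write went through

        ReplaceByRef(ref data);
        Console.WriteLine(data.Length); // 2: data now names the new array
    }
}
```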

For all common value types a reference to the value is about
the same size as the value itself, and passing by ref adds
a level of indirection -- so you are not saving anything
in space or time by passing by ref when you don't have to.
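
A rough way to see the sizes involved, assuming (as is usual on the CLR) that a by-ref argument is passed as a pointer-sized address; the class name is made up for illustration:

```csharp
using System;

public static class SizeDemo
{
    // True when an int is no larger than a pointer on this platform.
    public static bool IntFitsInPointer()
    {
        return sizeof(int) <= IntPtr.Size;
    }

    public static void Main()
    {
        // sizeof on built-in value types is a compile-time constant.
        Console.WriteLine("int:     {0} bytes", sizeof(int));    // 4
        Console.WriteLine("double:  {0} bytes", sizeof(double)); // 8

        // IntPtr.Size reports the pointer width of the running process.
        Console.WriteLine("pointer: {0} bytes", IntPtr.Size);    // 4 or 8
    }
}
```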

The only exception is "structs" which can hypothetically
be large, but they are only used for special purposes
and should be avoided in most vanilla programming.

Please correct me if I'm wrong, or if not, please
*stop* *confusing* *my* *students* who should always
pass by value unless it's absolutely necessary to pass
by ref.

Annoyed: -- Aaron (Stagnant) Watters

===
http://www.xfeedme.com/nucular/gut.py/go?FREETEXT=immoral+english
 
Jarlaxle

passing by ref passes the address, which would be 32 or 64 bits depending on
the processor architecture.

an int is 32 bits even on a 64-bit processor, so technically passing by ref
could be less efficient (trivial but true). so tell your students that!

arrays or classes pass the ref anyway so you only need to pass by ref if you
need to change the pointer itself (i think of it in c++ terms...parm**)

the only other time i find it effective to pass by ref (in c#) is for
structs to pass the address.

unfortunately no one agrees with me that there should be a way to return by
reference (the compiler could only allow member variables or static variables
to be returned by reference).
 
Marc Gravell

Jarlaxle said:
unfortunately no one agrees with me that there should be a way
to return by reference

There is kind-of a way to return by reference... "out"... except the caller
supplies the address, which makes perfect sense since the caller's stack is
still valid. In reality it is almost identical to "ref", except the caller
doesn't have to initialize the value first.
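
A small sketch of the difference, using hypothetical method names: with "out" the caller may pass an unassigned variable and the callee must assign it; with "ref" the caller must assign it first.

```csharp
using System;

public static class OutDemo
{
    // 'out': the callee must assign 'half' before returning.
    public static bool TryHalve(int input, out int half)
    {
        half = input / 2;
        return input % 2 == 0; // true when the division was exact
    }

    // 'ref': the callee may read the incoming value as well as write it.
    public static void Halve(ref int value)
    {
        value = value / 2;
    }

    public static void Main()
    {
        int result;                 // deliberately left unassigned
        bool exact = TryHalve(10, out result);
        Console.WriteLine("{0} {1}", result, exact); // 5 True

        int number = 10;            // must be assigned before passing by ref
        Halve(ref number);
        Console.WriteLine(number);  // 5
    }
}
```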

And re "ref" on structs - a side issue here is mutability; of course,
structs shouldn't normally be mutable, but if you pass a struct by value,
any changes the callee makes are discarded; if passed by ref they are
retained. But I stress: they shouldn't normally be mutable to start with!
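
To make the discarded/retained distinction concrete, here is a sketch with a deliberately mutable struct (Counter is a made-up type, used only to show the copy semantics, not a recommendation):

```csharp
using System;

// Hypothetical mutable struct, for illustration only.
public struct Counter
{
    public int Count;
}

public static class StructCopyDemo
{
    // Works on a copy: the caller's struct is untouched.
    public static void IncrementCopy(Counter c)
    {
        c.Count++;
    }

    // Works on the caller's variable: the change is retained.
    public static void IncrementInPlace(ref Counter c)
    {
        c.Count++;
    }

    public static void Main()
    {
        Counter counter = new Counter();

        IncrementCopy(counter);
        Console.WriteLine(counter.Count); // 0: the change was made to a copy

        IncrementInPlace(ref counter);
        Console.WriteLine(counter.Count); // 1: the change was made in place
    }
}
```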

Marc
 
Cowboy (Gregory A. Beamer)

I am not sure what textbook author (or which authors) you are talking about,
so I cannot assess the comment in context. I would agree, overall, with the
blanket statement that passing by ref is "more efficient" if we are seeing
efficiency in terms of performance and we are talking reference objects. If
I run a million iterations, I will see a slight performance benefit by
passing by reference, as I am not allocating memory in many cases.

But, I am trading "safety" for "efficiency" here. Take, for example, the
classic example (using int values, as you have mentioned this):

using System;

class Program
{
    static void Main()
    {
        int value = 1;

        // By value: the callee increments its own copy, so value stays 1.
        Console.WriteLine("Value before by val routine: {0}", value);
        Console.WriteLine(GetNextNumberVal(value));
        Console.WriteLine("Value after by val routine: {0}", value);

        Console.WriteLine(String.Empty);

        // By ref: the callee increments the caller's variable, so value becomes 2.
        Console.WriteLine("Value before by ref routine: {0}", value);
        Console.WriteLine(GetNextNumberRef(ref value));
        Console.WriteLine("Value after by ref routine: {0}", value);

        Console.Read();
    }

    private static string GetNextNumberRef(ref int input)
    {
        input++;
        return String.Format("The next number is {0}", input);
    }

    private static string GetNextNumberVal(int input)
    {
        input++;
        return String.Format("The next number is {0}", input);
    }
}

Of course, I have seen people who purposefully use by ref to allow a routine
to accomplish two jobs, which is also very dangerous.

Short version: I agree with your statements, but would have to see the
printed "efficiency" statement to determine whether, in context, it was
valid.

--
Gregory A. Beamer
MVP, MCP: +I, SE, SD, DBA

Subscribe to my blog
http://gregorybeamer.spaces.live.com/lists/feed.rss

or just read it:
http://gregorybeamer.spaces.live.com/

*************************************************
| Think outside the box!
|
*************************************************
 
Jon Skeet [C# MVP]

Cowboy (Gregory A. Beamer) said:
I am not sure what textbook author (or which authors) you are talking about,
so I cannot assess the comment in context. I would agree, overall, with the
blanket statement that passing by ref is "more efficient" if we are seeing
efficiency in terms of performance and we are talking reference objects. If
I run a million iterations, I will see a slight performance benefit by
passing by reference, as I am not allocating memory in many cases.

Why would any more memory be allocated by passing a reference by value
than by passing a reference by reference?

Can you give a complete example which shows this efficiency gain, with
reference types?
 
Cowboy (Gregory A. Beamer)

Jon Skeet said:
Why would any more memory be allocated by passing a reference by value
than by passing a reference by reference?

The norm, with value passing, is to create a copy of the objects. This is
not always true, and I have to admit I fired off based on generalities,
rather than examining how the CLR handles this. As it was not my primary
point (which was that one should not code for performance (efficiency)
alone), I did not feel it warranted a long discussion. :)

The OP mentions one case with arrays. Once again, the CLR may have a
different handling; I have not reversed the source or examined the reference
copies to figure out if it is true or false. If false, then I stand
corrected. It is really not that important, as I was agreeing with the OP
overall.

Jon Skeet said:
Can you give a complete example which shows this efficiency gain, with
reference types?

Not for most applications, which was the real point of my post. I am more an
advocate of developing for maintainability than performance. I was also
giving the benefit of the doubt, as I am certain of the perf benefits in
many systems and assume the CLR is still using the same basic working design
(stack versus heap, copying versus passing pointers, etc.).

The bulk of my post focused on the "even if there is more performance, the
safety issues outweigh the micro gain in perf." I believe we both can agree
on that.

 
Jarlaxle

yes i agree completely...the only problem is properties. if you have a class
that contains structs you can't return the structs by a property and have
them modified...

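struct mystruct
{
    public bool member;
}

class a
{
    private mystruct _datacontainer;
    public mystruct DataContainer
    {
        get { return _datacontainer; }
    }
}

// given an instance: a obj = new a();
obj.DataContainer.member = true; // compile error CS1612: the getter returns a copy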

should be able to do a...

    public ref mystruct DataContainer
    {
        get { return _datacontainer; }
    }

yes i know you can make it a class...but this is a trivial example and in
other examples with millions of iterations...i think there is value in
returning by ref (pun intended) instead of having to make it a class.
 
Marc Gravell

Cowboy (Gregory A. Beamer) said:
The norm, with value passing, is to create a copy of the objects.

Ahh; the confusion may have come from "and we are talking reference
objects" - when of course no clone occurs; we simply push/pop the address
on the stack - very similar to how the address of the variable (on the
stack) is pushed/popped for passing the variable by reference.

Indeed, the only time we truly consume memory here is when boxing a value
onto the heap; even when blitting a value onto the stack we are just using
memory that is already allocated for the stack. As a consequence, if you
pass huge structs on the stack you run the chance of stack-overflow sooner,
but if you have huge structs you already have a problem ;-p

Marc
 
Jon Skeet [C# MVP]

Cowboy (Gregory A. Beamer) said:
The norm, with value passing, is to create a copy of the objects.

Not in C# it's not. The value of the expression is passed, and that
value is a reference - just a pointer. That's all that's copied, not
the whole object.

Cowboy (Gregory A. Beamer) said:
This is not always true, and I have to admit I fired off based on generalities,
rather than examining how the CLR handles this. As it was not my primary
point (which was that one should not code for performance (efficiency)
alone), I did not feel it warranted a long discussion. :)

It's an incredibly important point though.

Cowboy (Gregory A. Beamer) said:
The OP mentions one case with arrays. Once again, the CLR may have a
different handling; I have not reversed the source or examined the reference
copies to figure out if it is true or false. If false, then I stand
corrected. It is really not that important, as I was agreeing with the OP
overall.

There's no need to reverse the source - simple specs are all that are
required. The behaviour is very simple to see:

using System;

public class Test
{
    static void Main()
    {
        string[] array = { "a", "b" };

        ChangeContents(array);

        // Prints "c"
        Console.WriteLine(array[0]);
    }

    static void ChangeContents(string[] x)
    {
        x[0] = "c";
    }
}

This shows that although the reference is passed by value, the object
(the array itself) is emphatically *not* copied.

Cowboy (Gregory A. Beamer) said:
Not for most applications, which was the real point of my post. I am more an
advocate of developing for maintainability than performance. I was also
giving the benefit of the doubt, as I am certain of the perf benefits in
many systems and assume the CLR is still using the same basic working design
(stack versus heap, copying versus passing pointers, etc.).

The bulk of my post focused on the "even if there is more performance, the
safety issues outweigh the micro gain in perf." I believe we both can agree
on that.

Yes - but I think it's absolutely *vital* to understand how reference
types work, and that in particular they do *not* involve copying
objects when a reference is passed by value.
 
Marc Gravell

Ah, then we disagree:

I'd argue quite strongly that mystruct should be immutable, in which case
this problem doesn't exist; to update a property you create a *new* mystruct
and push it into the property setter to update.
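
A sketch of that pattern, with made-up type names (Settings and Holder are assumptions, not types from this thread): the struct has no way to change after construction, so updating the property means assigning a whole new value.

```csharp
using System;

// Hypothetical immutable struct: state is set once, in the constructor.
public struct Settings
{
    private readonly bool _enabled;

    public Settings(bool enabled)
    {
        _enabled = enabled;
    }

    public bool Enabled
    {
        get { return _enabled; }
    }
}

// Hypothetical owning class exposing the struct through a property.
public class Holder
{
    private Settings _data;

    public Settings Data
    {
        get { return _data; }
        set { _data = value; }
    }
}

public static class ImmutableStructDemo
{
    public static void Main()
    {
        Holder holder = new Holder();
        Console.WriteLine(holder.Data.Enabled); // False (default value)

        // Replace the whole value instead of mutating a member of a copy.
        holder.Data = new Settings(true);
        Console.WriteLine(holder.Data.Enabled); // True
    }
}
```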

Marc
 
Jon Skeet [C# MVP]

Jarlaxle said:
yes i agree completely...the only problem is properties. if you have a class
that contains structs you can't return the structs by a property and have
them modified...

Ah, so you not only want return by reference, but you specifically want
them for mutable structs.

Aargh, aargh, aargh.

Mutable structs are to be avoided like the plague. They produce all
sorts of very subtle issues. Combining them with "return by reference"
sounds like a maintenance *nightmare* to me.
 
Cowboy (Gregory A. Beamer)

I just coded up some loops with a timer function and ended up saving far
less than 1 tick per iteration by using reference instead of value on a
large array (BS array, pure theory, no practical application).

In order to register a difference, I had to use a rather large loop with
do-nothing routines. I am also sure that some of the overhead shown could be
.NET (need longer testing to see if the inefficiency bounces to the ref
side).
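
For readers who want to try something similar, a timing loop of this kind might look like the sketch below. This is not the code used for the numbers above; the names, the array size and the iteration count are all assumptions, and results will vary with JIT, build settings and hardware.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class RefTimingSketch
{
    // NoInlining keeps the JIT from removing the call being measured.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FirstByValue(int[] items)
    {
        return items[0];
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int FirstByRef(ref int[] items)
    {
        return items[0];
    }

    public static void Main()
    {
        const int Iterations = 1000000;   // assumed loop size
        int[] data = new int[1000000];    // assumed "large" array
        long sum = 0;

        Stopwatch byValue = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) sum += FirstByValue(data);
        byValue.Stop();

        Stopwatch byRef = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++) sum += FirstByRef(ref data);
        byRef.Stop();

        Console.WriteLine("by value: {0} ticks", byValue.ElapsedTicks);
        Console.WriteLine("by ref:   {0} ticks", byRef.ElapsedTicks);
        Console.WriteLine(sum); // use the results so the loops have an observable effect
    }
}
```

Either way the array itself is never copied; both calls pass a single pointer-sized argument, so any difference measured is between passing the reference and passing the address of the variable holding it.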

If the findings turn out to be correct (not willing to spend the time),
then technically, by ref is more performant (or efficient). In reality, I
doubt someone is doing 1,000,000 iterations in a fraction of a second on a
burst, much less on a regular basis, which leads me back to my point: Coding
for efficiency or performance should not be one's first goal.

On this application, it would take hours at load to yield seconds difference
in performance. There are few applications that need to be tuned to the
point where one would have to grab these extra ticks. In those cases, I
would venture there are more practical ways to tweak the code for the extra
performance.

As an aside, I have very few instances in my code (cannot even think of one
off hand) where I will pass in by ref. It was more common in my COM code,
due to the prevailing patterns at that time.

 
Jon Skeet [C# MVP]

Cowboy (Gregory A. Beamer) said:
If the findings turn out to be correct (not willing to spend the time)

I strongly suspect you would find that they're not.

Cowboy (Gregory A. Beamer) said:
then technically, by ref is more performant (or efficient). In reality, I
doubt someone is doing 1,000,000 iterations in a fraction of a second on a
burst, much less on a regular basis, which leads me back to my point: Coding
for efficiency or performance should not be one's first goal.

That in itself is a fine point - but it's one which is obscured by the
erroneous idea that when you pass a reference by value it copies the
referenced object.
 
Lasse Vågsæther Karlsen

Jon said:
Ah, so you not only want return by reference, but you specifically want
them for mutable structs.

Aargh, aargh, aargh.

Mutable structs are to be avoided like the plague. They produce all
sorts of very subtle issues. Combining them with "return by reference"
sounds like a maintenance *nightmare* to me.

Wouldn't this function pretty much like classes anyway?

Sounds like he wants something like the old "Object" type of Delphi,
which could function both as a value type, and through a pointer, as a
reference type. Borland dropped this in favor of the pure heap-based
class though, as all sorts of initialization problems cropped up when
using virtual methods and value-types.
 
Cowboy (Gregory A. Beamer)

Jon Skeet said:
I strongly suspect you would find that they're not.

That is highly possible. Underneath the hood, it would be interesting to see
what is happening.

I actually thought of another way to examine the internal workings, through
the ability to run through the PDBs in Visual Studio 2008.

Jon Skeet said:
That in itself is a fine point - but it's one which is obscured by the
erroneous idea that when you pass a reference by value it copies the
referenced object.

Regardless, my primary point here was that, "even if", you should not do it:
the small perf gain (if there is one) is so negligible that it is not
worth it. And, since you can end up with bad artifacts, you can end up worse
off than passing by val.

I got on that bandwagon (perf is not everything) years ago when someone
posted an article on ASPToday.com stating that removing ASP comments made a
more performant application. While technically true, you either spend time
creating a comment stripping application (a waste, IMO) or you end up
leaving your application where it has little or no maintainability (bad).

I may revisit the by ref/by val argument later. ;-)

 
Cowboy (Gregory A. Beamer)

Marc Gravell said:
As a consequence, if you pass huge structs on the stack you run the chance
of stack-overflow sooner, but if you have huge structs you already have a
problem ;-p

Compound that with people who try using only structs (no classes) to attempt
to speed up their applications. I had to detangle an app a few years ago
where the "architect/designer" felt that having all of his "objects" in the
stack would speed up his application, so he forced the dev team to code
classes as structs. As the application grew, and the number of users grew
... I think the point is already made.

 
Ian Semmel

I reckon you should tell your students to get on with the job at hand
instead of pursuing the anal-retentive pastime of trying to out-guess
the C# compiler.

People who engage in this form of "efficiency" usually end up with
highly inefficient programs as they are concentrating on areas which
have no bearing on the application they are trying to develop. Ask them
how long it takes to write a screen compared to the time difference in
using 'ref' or not.
 
Ben Voigt [C++ MVP]

Aaron said:
Hi. I've been teaching C# and I'm tired of students
telling me that they passed ints by ref when they don't
need to because it's "more efficient". I've seen this myth
repeated in many textbooks, but it's completely wrong
in C# (and it's always wrong for ints in any language).

As the instructor, it's your responsibility to select a textbook that
doesn't make such fundamental mistakes.
 
