'null' references

Jon Skeet [C# MVP] · Jun 16, 2007

valentin tihomirov said:
But you cannot read the 'out' parameter in order to accomplish the trick of
passing 'null' reference.

Do you remember how early on I asked you to describe what you were
trying to achieve *without referring to other languages*? I can't help
but feel this is still the only way forward in this discussion...

Jon Skeet [C# MVP] · Jun 16, 2007

Larry Smith said:
You're explaining the language's existing rules which is fine, but I'm
questioning why it was done that way in the first place. Let's say I have a
function like "_splitpath" in the CRT. It takes a full path name as arg1 and
returns its drive letter, directory, file name and extension in the next
four "out" parameters.

And that's a pretty hideous design, when it comes to OO, isn't it? C#
is designed as an OO language, where you'd want to encapsulate those
output parameters in a type.

That occasionally makes it hard to use APIs which are completely
non-OO, as in this case.

I'm only interested in the directory for instance so
I want to pass "null" for the other "out" arguments. I can't do this however
which is very inconvenient. I'm forced to pass all "out" arguments IOW.
Someone might argue not to design methods like this but there's nothing
inherently wrong with it so it's ultimately the programmer's choice. It may
also prove more efficient in some cases, letting the function know it
doesn't have to carry out the extra work of retrieving all "out" parameters.

I'd say that's a pretty nasty way of letting the method know that - I'd
far prefer to use an option of what to retrieve, or an overload.

Nor do I have to create additional methods to retrieve each individual piece
of info or otherwise create an extra class to store all this data.

No, but that would be a more OO-approach, and a much tidier one IMO.
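
For comparison, here is a minimal C++ sketch of the "encapsulate the outputs in a type" approach. `PathParts` and `parse_path` are hypothetical names invented for illustration, not a real library API, and the parsing rules ('/' separators, extension starts at the last '.') are assumptions:

```cpp
#include <string>

// Hypothetical value type that groups the related outputs.
struct PathParts {
    std::string directory;
    std::string name;
    std::string extension;
};

// Hypothetical parser: assumes '/' separators and treats everything from
// the last '.' of the file name onwards as the extension.
PathParts parse_path(const std::string& path)
{
    PathParts parts;
    const std::string::size_type slash = path.find_last_of('/');
    const std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    const std::string::size_type dot = file.find_last_of('.');

    if (slash != std::string::npos) {
        parts.directory = path.substr(0, slash);
    }
    parts.name = file.substr(0, dot);  // npos means "to the end"
    if (dot != std::string::npos) {
        parts.extension = file.substr(dot);
    }
    return parts;
}
```

A caller interested only in the directory reads `parse_path(p).directory` and ignores the rest. The cost is that every part is always computed.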

Sometimes I'm even just interested in a function's return value
without caring about the "out" parameter(s). Barring any tangible
constraints on the language's ability to implement this feature, I
see no valid reason why the rules can't and don't support it.

I can't think of any way of doing this without losing some elegance.
What would happen if you *did* try to assign a value in method
accepting a ref/out parameter which had been called with null? If it
would go bang, that creates additional work for *every* method which
takes ref/out parameters, for one thing.

I rather like the inability to read an "out" parameter before it's been
definitely assigned in the method - it makes it very simple to describe
the behaviour and the requirement that an out parameter is definitely
assigned before a method terminates normally.

This occasionally makes it harder to work with non-OO APIs, but I'd
rather that was penalised in favour of those who *are* taking an OO
approach than vice versa.

Larry Smith · Jun 16, 2007

I completely disagree. There's nothing inherently ugly about it at all. It's
very clean in code and C/C++ developers have been using it for decades. Most
of the time you're dealing with one or two "out" parameters only so passing
"null" is a very quick and convenient (and consistent) way to tell any
function not to fill something in. It's highly legible (self-describing) and
no extraneous flags or properties are required to control what you want
(which means more work and more code). As for it being error-prone, that
argument is almost groundless in practice. It's no more error-prone than
passing null for a non-out parameter if the function isn't designed to
handle it. Any function should be prepared to handle null even if it means a
simple Assert at the very least. For an "out" parameter, one simple check for
null is all that's required to make it optional. I've been in the coding
trenches for almost 25 years and can't recall anytime I had an issue with
this.
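
The C/C++ idiom described here can be sketched roughly as follows. `split_path` is a hypothetical helper invented for illustration; it does not reproduce the CRT's `_splitpath`, and the parsing rules ('/' separators, extension starts at the last '.') are assumptions. Each output is a pointer, and a null pointer means "don't fill this in":

```cpp
#include <string>

// Hypothetical helper, loosely modelled on the idea behind _splitpath.
// Any output pointer may be null, meaning the caller does not want that part.
void split_path(const std::string& path,
                std::string* dir,
                std::string* name,
                std::string* ext)
{
    const std::string::size_type slash = path.find_last_of('/');
    const std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    const std::string::size_type dot = file.find_last_of('.');

    // One null check per output is all it takes to make it optional.
    if (dir != nullptr) {
        *dir = (slash == std::string::npos) ? std::string()
                                            : path.substr(0, slash);
    }
    if (name != nullptr) {
        *name = file.substr(0, dot);  // npos means "to the end"
    }
    if (ext != nullptr) {
        *ext = (dot == std::string::npos) ? std::string() : file.substr(dot);
    }
}
```

A call such as `split_path(p, &dir, nullptr, nullptr)` then asks for the directory only.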

Barry Kelly · Jun 16, 2007

valentin said:
As explained in "Using pointers vs. references"
http://groups.google.ee/group/borla...0f161fc1e9c/ab294c7b02e8faca#ab294c7b02e8faca ,
the pointers are allowed to be null, while references must refer an existing
variable of required type. The null is normally used for making optional
parameters. But there is no way to pass null reference in C#. Something is
missing.

That's right, there's no way to verifiably pass a reference to an
invalid location logically corresponding to null in .NET. It could be
implemented - and memory safety preserved - since the hardware would
still trigger access violation exceptions, but it isn't.

It's not a big deal, though. The benefit of being able to cause null
reference exceptions when assigning to parameters, or avoid having to
create a dummy local for passing into a routine, is pretty marginal.

-- Barry

Barry Kelly · Jun 16, 2007

valentin said:
THEY ARE NOT !!!

There's no need to shout, and C# / .NET's concept of a ref parameter is
a limited subset of C++'s '&' reference parameters. They're not the
same, true, but they are strongly related.

Show me where I tell that Delphi/C++ references are different from C#
references? I repeat once again, THESE ARE POINTERS WHICH ARE DIFFERENT FROM
REFERENCES !!!

References are implemented on the machine as pointers. The CPU has no
model for a 'reference'; only a memory address, i.e. pointer. Higher
levels, such as Delphi, C++ or .NET, make a distinction for type safety
and (in the case of .NET) memory safety reasons.
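
As a small illustration in C++ (a sketch with hypothetical function names; the C++ standard does not mandate how references are implemented, only how they behave), a reference parameter and a pointer parameter both end up designating the caller's variable:

```cpp
// Returns the address of the object a reference parameter is bound to.
const int* address_via_reference(const int& r) { return &r; }

// Returns the address held by a pointer parameter.
const int* address_via_pointer(const int* p) { return p; }

// Both of these modify the caller's variable; only the call syntax differs.
void increment_via_reference(int& r) { ++r; }
void increment_via_pointer(int* p) { ++*p; }
```

Given `int v;`, both `address_via_reference(v)` and `address_via_pointer(&v)` yield `&v`. The difference is that only the pointer version has a well-defined way to say "no object at all".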

Importing a dll function, one may declare their arguments whether as pointers
or references. Both will work, because internal mechanics, the
implementation is the same. But reference prevents passing 'null'. The
compiler will not allow. They must refer existing object.

Yes. However, when you see an API function declaration in Delphi that
uses 'var' or 'out' in the parameter declaration list where the API
documentation indicates that the parameter is actually optional, the
declaration is in fact in error - it should have been declared to take a
pointer instead of a reference.

Sure, you can use something like PFoo(nil)^ to pass a reference to an
invalid null location, and thereby get around the bug in the declaration
- but you can't do that in C# / .NET. Microsoft could have implemented
it, but they didn't. That's the way it is; it's not a big loss.

-- Barry

Ben Voigt [C++ MVP] · Jun 16, 2007

Austin Ehlers said:
But is that a reference type? Besides IPAddress.TryParse, they're all
structs. (Notice I said "usually bad, trying to do too much in one
method". Parsing a string into a type is a common, single idea).

Dictionary`2.TryGetValue, anyone?

Ben Voigt [C++ MVP] · Jun 16, 2007

Jon Skeet said:
Right, I'm with you.

Hmm... that may be an idiom I've never come across. The first two make
absolute sense (even if they're fairly rarely required), but this one
is outside my experience.

Well, for example, Dictionary`2.TryGetValue has to return the same object
every time. It's not sufficient to copy the data into a caller-provided
object. There may be cases where the method has enough information to copy
(Dictionary couldn't even if it wanted to), but you want to share the same
object every time anyway. I think this is part of the Flyweight pattern.
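
A rough C++ analogue of that shape, as a sketch only: `GlyphCache`, `Glyph` and `try_get` are hypothetical names, and this is not how `Dictionary` is implemented. The out parameter is itself a handle passed by reference, so the callee can re-seat it to the one shared object instead of copying data into something the caller supplied:

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical shared object type.
struct Glyph {
    char symbol;
};

// Hypothetical cache that hands out the same stored object on every lookup.
class GlyphCache {
public:
    void add(const std::string& key, char symbol)
    {
        items_[key] = std::make_shared<Glyph>(Glyph{symbol});
    }

    // On success, 'out' is re-seated to the stored object (no copy is made).
    // On failure, 'out' is reset to null and false is returned.
    bool try_get(const std::string& key, std::shared_ptr<Glyph>& out) const
    {
        const auto it = items_.find(key);
        if (it == items_.end()) {
            out.reset();
            return false;
        }
        out = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<Glyph>> items_;
};
```

Two successful lookups for the same key leave both handles pointing at the same `Glyph`, which is the referential identity Flyweight relies on.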

Peter Duniho · Jun 16, 2007

But is that a reference type? Besides IPAddress.TryParse, they're all
structs.

Which are all structs? I know you're not claiming to have surveyed every
instance of a method named "TryParse" that anyone has ever written in C#.
And there's certainly no requirement that one limit the use of TryParse()
with structs.

I see no reason at all to restrict the TryParse semantics or similar to
structs. Just because in the .NET Framework itself they are (almost) all
structs, that's no reason to think you can't still do the same with
classes.

(Notice I said "usually bad, trying to do too much in one
method". Parsing a string into a type is a common, single idea).

You also said "Please, show me where a ref-by-ref is needed in C#/.NET"
and I believe it was to that that Ben was responding. He did exactly as
you asked.

And that's fine. I'm just trying to show the OP that things are
different in C# than C/C++, and trying to use C-style code will lead
to a bad design.

Well, ignoring for the moment that your reply to Ben's post _appears_ to
me to indicate that it's not "fine"...

There's really nothing about this issue that distinguishes C++ from C#.
The same exact question regarding whether it's good to have by-reference
parameters exists in either language, and has the same arguments in favor
of and against.

Pete

Peter Duniho · Jun 16, 2007

Larry grasped it very well. But the correct answer, which I assumed in the
OP, is not because C# references are the same as references in other
languages. Doing so is not allowed because the pointers, which are allowed
to refer any areas of memory where no valid objects exist including address
0, are not supported in C#.

That makes no sense at all. The question of "ref" parameters has exactly
zero to do with the implementation of references vs pointers. You are
trying to apply some nonsensical, irrelevant aspect of the data type
implementation in debating a higher-level language construct.

Pete

Ben Voigt [C++ MVP] · Jun 16, 2007

Peter Duniho said:
It's true. In C++ you can use casting to get around practically any
limitation intentionally put into the language.

There is no cast required.

int* p = nullptr;
int& i = *p;

I just used cast syntax to avoid an additional line of code; the actual
conversion is implicit.

I don't see how that invalidates my point though. I have never written
C++ code that checks something passed by reference for null references,
nor have I ever had to maintain code written that way.

I never suggested that the reason to use references was to avoid having to
check for null pointers, but I do feel that is in fact an advantage (if
the caller wants to explicitly get around that safe-guard, that's a bug in
the caller akin to using Reflection in C# to get around a variety of
safe-guards C# puts into place), and I bristle at your implication (if not
outright accusation) that I am "misinformed" just for thinking so.

There's no explicitness in the getting around it. It can happen in
perfectly good-looking code. Thinking a reference is anything different
from syntactic sugar around a pointer *is* misinformed.

class X
{
public:
    bool operator==(const X& other)
    {
        ...
    }
    ...
};

void use_it(T* p)
{
    T t(...);
    if (t == *p)
        ...
}

Clearly, if p is NULL, then operator== gets an invalid reference. There's
no sleight-of-hand necessary to accomplish it. Using the variable "other"
without testing if its address is NULL will have the exact same effect as
dereferencing p without checking if it is NULL. If you are using references
in your code as a means of documenting "NULL is not permitted for this
pointer", that's fine. But it is not a language restriction.

Ben Voigt [C++ MVP] · Jun 16, 2007

Larry Smith said:
Not only is it bad, it's undefined behaviour so it doesn't qualify as
valid C++. You can't legally dereference a null pointer nor is a null
reference legally possible.

Is creating a reference from a pointer defined as dereferencing that
pointer, even though no access through that pointer occurs until much later?

Ben Voigt [C++ MVP] · Jun 16, 2007

Jon Skeet said:
Do you remember how early on I asked you to describe what you were
trying to achieve *without referring to other languages*? I can't help
but feel this is still the only way forward in this discussion...

Here's the way to achieve it: Use a pointer, same as you would in every
other language to accomplish the same thing.

The problem isn't .NET's treatment of ref/out parameters, it is the
treatment of all pointers as unsafe. Creating a pointer is type-safe.
Dereferencing a pointer is type-safe. Comparing pointers is type-safe. The
only unsafe operation is incrementing (for any increment other than zero) a
pointer. And, it should be possible for highly trusted code to provide
bounded-pointer-iterators for arrays that can be used by safe code, and this
would also solve the problem of modifying struct members in-place in a
collection the same way it is possible with structs in arrays.

Jon Skeet [C# MVP] · Jun 16, 2007

Larry Smith said:
I completely disagree. There's nothing inherently ugly about it at all. It's
very clean in code and C/C++ developers have been using it for decades. Most
of the time you're dealing with one or two "out" parameters only so passing
"null" is a very quick and convenient (and consistent) way to tell any
function not to fill something in. It's highly legible (self-describing) and
no extraneous flags or properties are required to control what you want
(which means more work and more code). As for it being error-prone, that
argument is almost groundless in practice. It's no more error-prone than
passing null for a non-out parameter if the function isn't designed to
handle it. Any function should be prepared to handle null even if it means a
simple Assert at the very least. For an "out" parameter, one simple check for
null is all that's required to make it optional. I've been in the coding
trenches for almost 25 years and can't recall anytime I had an issue with
this.

Well, I still think it smacks of being completely non-OO. The values
are clearly related, so encapsulating them into a single type makes
more idiomatic sense to me.

If you could change C# to accommodate this, how would you try to do so?
Checking the out parameter with:

if (outParam == null)

would be inconsistent in my view, as everywhere else in C# that would
be checking whether the *value* of outParam was null, not whether it
was an out parameter which didn't have anywhere to store its result.
I'd prefer a new keyword to using the above syntax - and I'd prefer not
having it at all to having a new keyword.
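
The two meanings being distinguished here are easy to see in C++, where they are spelled differently. This is a sketch only: `find_extension` is a hypothetical function, and the out parameter is a pointer to a pointer so that both "nowhere to store the result" and "the result is null" can occur:

```cpp
#include <cstring>

// Hypothetical lookup: stores a pointer to the last '.' of file_name in *out.
//   out == nullptr  -> the caller supplied nowhere to store the result
//   *out == nullptr -> a result was stored, and its value is null
bool find_extension(const char* file_name, const char** out)
{
    const char* dot = std::strrchr(file_name, '.');
    if (out != nullptr) {
        *out = dot;  // may legitimately store a null value
    }
    return dot != nullptr;
}
```

In C# a single expression `outParam == null` would have to cover both cases, which is the inconsistency described above.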

Jon Skeet [C# MVP] · Jun 16, 2007

Ben Voigt said:
Well, for example, Dictionary`2.TryGetValue has to return the same object
every time. It's not sufficient to copy the data into a caller-provided
object. There may be cases where the method has enough information to copy
(Dictionary couldn't even if it wanted to), but you want to share the same
object every time anyway. I think this is part of the Flyweight pattern.

Sort of - I certainly wouldn't have thought of it in terms of demanding
referential equality, just in terms of "returning an object" (yeah, a
reference really). But I'm glad I understand all your points now.

Ben Voigt [C++ MVP] · Jun 16, 2007

Jon Skeet said:
Sort of - I certainly wouldn't have thought of it in terms of demanding
referential equality, just in terms of "returning an object" (yeah, a
reference really). But I'm glad I understand all your points now.

Well, the alternative was, fill in the object the caller gave you by
reference... which is an anti-pattern to Flyweight.

Jon Skeet [C# MVP] · Jun 16, 2007

Ben Voigt said:
Well, the alternative was, fill in the object the caller gave you by
reference... which is an anti-pattern to Flyweight.

Sure - it's just not something which would normally have occurred to me
to even think of, and I just didn't get it from your original
description. No biggie.

Peter Duniho · Jun 16, 2007

There is no cast required.

int* p = nullptr;
int& i = *p;

I just used cast syntax to avoid an additional line of code; the
actual conversion is implicit.

IMHO, your example is still contrived, and semantically the error is at
the point where you've dereferenced a null pointer, as opposed to using
the reference later. But more importantly, your example abandons the
specific context in which my statement was made.

My comment was strictly about the use of a null _explicitly_ as a
parameter. You seem to have taken that as an opening to infer all sorts
of other things that I never wrote, nor ever intended anyone to infer.

There's no explicitness in the getting around it. It can happen in
perfectly good-looking code. Thinking a reference is anything different
from syntactic sugar around a pointer *is* misinformed.

Whatever. Personally, I find that reference parameters in C++ help a lot
in documenting what is expected of the caller.

[...]
void use_it(T* p)
{
    T t(...);
    if (t == *p)
        ...
}

Clearly, if p is NULL, then operator== gets an invalid reference.

Again, semantically it's my opinion that the error occurs at the point of
dereference. That is, "*p" is invalid itself.

Does the use of a reference parameter prevent that? No, I never meant to
imply that it did. But the fact is that the null dereference would be an
error even without the reference parameter. That is, it's not even
related to the use of the reference parameter.

All that the use of the reference parameter does is change when the null
pointer causes the error. But semantically, the error occurred outside
the function using the reference parameter, when the null pointer was
dereferenced.

Pete

Ben Voigt [C++ MVP] · Jun 16, 2007

In my opinion, the error is in use_it's caller passing a NULL parameter
against the documented contract of the function, or use_it's author failing
to document that requirement. My point was simply that using a reference
parameter is not significantly different from a pointer parameter as far as
NULL handling is concerned. The one is just syntactic sugar for the other.

Peter Duniho said:
void use_it(T* p)
{
    T t(...);
    if (t == *p)
        ...
}

Clearly, if p is NULL, then operator== gets an invalid reference.

Again, semantically it's my opinion that the error occurs at the point of
dereference. That is, "*p" is invalid itself.

Does the use of a reference parameter prevent that? No, I never meant to
imply that it did. But the fact is that the null dereference would be an
error even without the reference parameter. That is, it's not even
related to the use of the reference parameter.

All that the use of the reference parameter does is change when the null
pointer causes the error. But semantically, the error occurred outside
the function using the reference parameter, when the null pointer was
dereferenced.

Pete

Barry Kelly · Jun 17, 2007

Peter said:
That makes no sense at all.

Actually, it makes sense if you know what Valentin's trying to say - it
makes perfect sense to me, at least. Valentin is talking in the OP about
a certain construct - a "dereferenced null" - that can be passed in some
languages as an argument where the parameter is declared as
pass-by-reference. The result is that any attempt to read or write the
argument in the body of the method causes an AV or null reference
exception.

In the post that you've just replied to, he's talking about a property
of verifiable code running on the CLI virtual machine, memory safety
(i.e. incorrectly typed references are impossible). He's suggesting that
the correct answer to the reason why this "dereferenced null" concept is
missing from C# is because it violates the CLI's memory safety. This
isn't true: the CLI could easily implement it safely, however it would
definitely surprise everyone to add it at this late point. In any case,
it's not a big loss that it's a missing feature of the CLI or C#.

The question of "ref" parameters has exactly
zero to do with the implementation of references vs pointers. You are
trying to apply some nonsensical, irrelevant aspect of the data type
implementation in debating a higher-level language construct.

Peter, sometimes the people you reply to do have an idea that you
haven't considered yet!

-- Barry

Barry Kelly · Jun 17, 2007

Jon said:
Well, I still think it smacks of being completely non-OO.

It's orthogonal to OO, IMO, even if OO was an unalloyed virtue, which it
most certainly isn't, IMHO. OO includes methods, so it basically includes
a good chunk of the whole of structured programming - for better or
worse.

The values
are clearly related, so encapsulating them into a single type makes
more idiomatic sense to me.

What if the values you're retrieving have different costs, and you don't
want to pay the whole cost, only the cost for the bits you want?

There are plenty of alternative designs, but I don't think any of them
match the C/C++ solution in clarity or conciseness.

Some off the top of my head:
* have many methods instead of one (API bloat)
* have an auxiliary flags argument that specifies the arguments you're
interested in
* return a proxy object which calculates and caches the values on demand
(implementation overhead, runtime overhead, API bloat)
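
To make the second alternative concrete, here is a minimal C++ sketch of the auxiliary-flags design. `PathFields`, `PathInfo` and `query_path` are hypothetical names, and the parsing rules ('/' separators, extension starts at the last '.') are assumptions. Only the requested fields are filled in, so a real implementation could skip whatever work the unrequested ones need:

```cpp
#include <string>

// Hypothetical bit flags naming the fields the caller is interested in.
enum PathFields : unsigned {
    kDirectory = 1u << 0,
    kName      = 1u << 1,
    kExtension = 1u << 2
};

// Hypothetical result type; fields that were not requested stay empty.
struct PathInfo {
    std::string directory;
    std::string name;
    std::string extension;
};

PathInfo query_path(const std::string& path, unsigned wanted)
{
    PathInfo info;
    const std::string::size_type slash = path.find_last_of('/');
    const std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);
    const std::string::size_type dot = file.find_last_of('.');

    if ((wanted & kDirectory) && slash != std::string::npos) {
        info.directory = path.substr(0, slash);
    }
    if (wanted & kName) {
        info.name = file.substr(0, dot);  // npos means "to the end"
    }
    if ((wanted & kExtension) && dot != std::string::npos) {
        info.extension = file.substr(dot);
    }
    return info;
}
```

Compared with passing null for unwanted outputs, the caller has to keep the flags and the fields it reads in sync by hand, which is part of the clarity cost being argued about.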

If you could change C# to accommodate this, how would you try to do so?

That's a syntax issue - it's pretty irrelevant.

-- Barry
