Why does x.ToString() throw an error if x == null?

Michael C

If we have something like this

object x = null;
MessageBox.Show(x.ToString());

we get an error. That kind of makes sense, but there is no reason the
behaviour couldn't be to return an empty string. When we call x.ToString we
are really calling a function like this:

class Object
{
public static string ToString(object* Instance)
{
//code to convert the object pointer to a string
}
}

Why can't the compiler just pass in Instance as null and let the code sort
it out? If the code wants to throw an error then it can, but if it wants to
do something more useful then it can do that too. This would be useful in
some cases (I admit they are rare, but the ToString example above could
potentially be used quite often). I can think of a few cases where this
could be useful, e.g.:

string x = null;
MessageBox.Show(x.IsNullOrEmpty.ToString());
MessageBox.Show(x.Length.ToString()); //<--- shows zero
DateTime? y;
MessageBox.Show(y.IsValid);

The interesting thing is you can define extension methods that do work on
null objects, e.g.:

static class Extensions
{
    public static bool IsSomething(this string s)
    {
        return !string.IsNullOrEmpty(s);
    }

    public static string ToStringIgnoreNull(this object s)
    {
        if (s == null) return string.Empty;
        return s.ToString();
    }
}

Then use it like this:

string x = null;
if (x.IsSomething()) MessageBox.Show(x);

or

object x = null;
MessageBox.Show(x.ToStringIgnoreNull());

Any comments?

Cheers,
Michael
 
Alberto Poblacion

Michael C said:
[...] When we call x.ToString we are really calling a function like this:

class Object
{
public static string ToString(object* Instance)
{

No, not really. You can do that with extension methods (as you already
point out at the end of your message), but the real ToString is not static;
it is a virtual method that can be overridden in child classes:

class Object
{
    public virtual string ToString()
    {
        // The default implementation returns the fully qualified type name.
        return this.GetType().ToString();
    }
}

This can't work if the object is null.
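
To see the point about virtual dispatch concretely, here is a minimal sketch. The Animal class and the DispatchDemo.Describe helper are made up for illustration. The only behaviour relied on is that a virtual call picks the override from the instance's run-time type, and that making the call through a null reference throws NullReferenceException.

```csharp
using System;

// Hypothetical type used only for this illustration.
class Animal
{
    public override string ToString() { return "Animal"; }
}

static class DispatchDemo
{
    // The static type of x is always object, yet the override that runs
    // is chosen from the run-time type of the instance x refers to.
    public static string Describe(object x)
    {
        try
        {
            return x.ToString();
        }
        catch (NullReferenceException)
        {
            // With x == null there is no instance to take the type from.
            return "<NullReferenceException>";
        }
    }
}
```

Calling DispatchDemo.Describe("5") runs String.ToString, DispatchDemo.Describe(new Animal()) runs the Animal override, and DispatchDemo.Describe(null) ends up in the catch block.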
 
jp2msft

Even if you declare an object, it doesn't really exist yet until you have
initialized it and the compiler has not set aside any space for it.

To say, "Look at the space for this object that I have not created yet and
tell me what it says" would naturally throw an error.

What might be possible (though I wouldn't know how to do it) would be to
define behaviors for your compiler. If x is null, then the compiler or
language could return something useful (instead of crashing or running off
into oblivion). God knows that happens to me plenty!
 
Michael C

jp2msft said:
Even if you declare an object, it doesn't really exist yet until you have
initialized it and the compiler has not set aside any space for it.

The bulk of the object does exist even before you declare the object.
Generally the actual code will be greater than the object itself (in size)
and all of the code exists and is ready to call before an instance is
created.

jp2msft said:
To say, "Look at the space for this object that I have not created yet and
tell me what it says" would naturally throw an error.

There is no reason calling a function on a null object *has* to raise an
exception; the designers of C# just designed it that way. They actually
needed to put an extra check in to stop it working. Basically when you call,
say, object.ToString you end up calling a global function ToString where the
pointer to the object is passed in, i.e.

x.ToString();

translates to (under the hood of course)

public static string ToString(object* pObject)
{
    //convert pObject to string
}

I'm sure it's more complicated than that but that's the basic idea. The
compiler does the null check before the static function is called but there
is no reason that it needs to. It can easily call the static ToString
function and let that function decide whether to allow nulls or not.

Michael
 
Michael C

Alberto Poblacion said:
No, not really. You can do that with extension methods (as you already
point out at the end of your message), but the real ToString is not
static; it is a virtual method that can be overridden in child classes:

Technically the real ToString is not static, it's global. But as C# doesn't
have global functions I wrote it as static.

Alberto Poblacion said:
class Object
{
    public virtual string ToString()
    {
        return this.GetType().Name;
    }
}

This can't work if the object is null.

Yes it can, because the type tells you whether to call the original ToString
or the overridden ToString, e.g.

object x = null;
x.ToString(); //<-- calls the ToString defined by object

MyClass x = null;
x.ToString(); //<-- calls the ToString override on MyClass

Given that the real ToString function looks like this:

public static String ToString(MyClass* Value)
{
}

it is possible to call it on a null reference, because null can be passed in
to the Value parameter.

Michael
 
Michael C

Peter Duniho said:
No, Object.ToString() is not static, and it's not global.

I'm loath to reply to you, Peter, as I find you can be quite rude (and
usually very negative and unpleasant), but here goes for now.

Actually that is wrong, every function under the hood is global. It has a
location in memory and any code in the current process can call that
function. The C# IDE tricks us into thinking it's instance or static or
private or whatever.

Peter Duniho said:
It's a virtual instance method in the Object class. Every type inherits
Object and so every object has the ToString() method. You are right that
C# doesn't have global functions, and that includes even
Object.ToString().

I'm talking about under the hood at the assembly level.

Peter Duniho said:
No, it can't. You cannot use any instance member of a class with a null
reference. In the case of ToString(), it's particularly problematic
because it's a virtual method and you need an instance to get at the
v-table. But even for non-virtual members the run-time requires an actual
instance.

There is no reason this can't be worked around.

Peter Duniho said:
For example, given that:

String x = "5";
Object obj = x;

obj.ToString();

winds up calling the String.ToString() implementation, what should happen
if you have:

String x = null;
Object obj = x;

obj.ToString();

Simple, it calls the Object.ToString.

Peter Duniho said:
To be consistent, it should call String.ToString(). But how is it to know
that should happen?

Obviously it should call Object.ToString.

Peter Duniho said:
The real ToString() method doesn't look anything like that.

Really? What does it look like then, Peter?

Michael
 
Michael C

Peter Duniho said:
That's only partly true. For virtual methods, it _does_ have to raise an
exception.

No, it can do whatever the designers decide it can do. They could have
called the ToString method if they had wanted to. Possibly there are
disadvantages, but to say it is not possible is just plain wrong.

Peter Duniho said:
That's just not true.

Oh yes it is (see, I can disagree with no detail too :)

Peter Duniho said:
There's no static function,

Static was not the best choice of words but I've already explained why I
used that term. I'd repeat my explanation but it appears you've missed it a
couple of times already.

Peter Duniho said:
and it's not the compiler generating the code to do the check. It's part
of the run-time.

Big deal, there's still a compiler and the compiler could do whatever the
designers want.

Peter Duniho said:
No, it can't.

Oh yes it can stoopid.
 
Michael C

Peter Duniho said:
You only find me "very negative and unpleasant" because you tend to post
things that don't make sense, which I then find myself correcting. It's
unfortunate that you don't handle disagreement and critique any better
than you do. Suffice to say, I don't go out of my way to upset you; you
just take it upon yourself to interpret things that way.

Possibly you are unaware of how you come across. This post is the perfect
example: I was just throwing out a suggestion that was kind of tongue in
cheek, but you went out of your way to shoot the idea down. The funny thing
is there are real, valid points you could have brought up that you didn't,
and most of what you did post was rubbish.

Peter Duniho said:
Witness the fact that in spite of me not writing anything significantly
different from what Alberto's written, you've already got that big chip
sitting up on your shoulder. You're just itching for a fight.

No, I'm just going by your past posts.

Peter Duniho said:
If that's true, then the term "global" is meaningless. If the word can't
be used to distinguish one kind of method from another, what's the point
of using it?

It's just a simple point I was making that under the hood every function is
a global function. The fact that you feel the need to argue even this says
something.

Peter Duniho said:
That said, using more widely accepted definitions of "global", it's not
true at all that "every function under the hood is global". The word
"global" refers to identifiers that require no qualification in order to
be resolved. It doesn't even make sense to talk about "global" with
respect to "under the hood", because "global" is an artifact of the
higher-level language being used.

You're arguing crap again, Peter. This is a minor issue not worth getting
into a debate about. In assembler all functions are global.

Peter Duniho said:
In C#, there are no globals. Period. Everything requires qualification,
being contained at a minimum inside a class that's inside a namespace.

Wow, who would have thought.

Peter Duniho said:
No. If you were talking about C++, that would not be far from the truth.
But for managed code, the run-time is a lot more involved, while the C#
compiler leaves these things to the run-time. The C# compiler isn't
"tricking us" about anything.

I suggest you take a look at the code generated by the C# compiler. You
would find it educational.

Many thanks, Peter, I will take that into account.

Peter Duniho said:
In C#? Yes, there is. The C# compiler has no control over the rules the
run-time imposes.

Whatever, MS could have done it.

Peter Duniho said:
Even more generally, you simply cannot call a virtual method without an
instance. You couldn't even do what you're talking about in plain,
unmanaged C++. If you still think you can call a virtual method without
an instance, I encourage you to post the C++ code that would do so.

If MS wanted to do it they could. They could just use the VTable for the
type.

Peter Duniho said:
Um, no. That's far from obvious. Even aside from the inconsistency issue
I pointed out, the fact is that you need to decide when you design the
language: are you going to use static typing or polymorphism to decide
which method to call?

If you use static typing, then the compiler gets to decide which method to
call, and that has to be based on the type of the variable used. That
means that in the first example I gave, "obj.ToString()" is going to call
Object.ToString() even though the type of the instance is actually
String. Whether you like it or not, that's completely contrary to the
whole point of making ToString() virtual in the first place. Which means
that the language might as well not have virtual members.

Rubbish.

Peter Duniho said:
Now, one can certainly design a language without virtual members. But
that's not C#, nor is it any other widely used OOP language. Virtual
members are a major component of what makes OOP so powerful.

That is rubbish. I don't even know where you come up with this stuff. Adding
this feature would not require removing virtuals.

Peter Duniho said:
So, maybe you decide you'd rather use polymorphism to decide which method
to call. That means that you need to look at the actual instance to
decide which method to call. But to do that, you need an instance and of
course instances exist only at run-time so the compiler cannot be involved
in making the decision at all.

So which is it going to be? Cripple the OOP language? Or disallow
calling of virtual methods with a null reference?

Rubbish.

Peter Duniho said:
Alberto already posted it. Was there something about that you had trouble
understanding?

I didn't think you would be able to answer that one.

Michael
 
Michael C

Peter Duniho said:
You should probably learn a little more about virtual methods (functions)
(particularly as they are implemented in languages like C++, C#, VB.NET
and Java) before you make any more claims like that. Here's a decent
place to start:
http://en.wikipedia.org/wiki/Virtual_table

I know how virtual methods work.

Peter Duniho said:
No, it's exactly right. By definition, a virtual method needs an
instance. The run-time can't just go picking arbitrary methods to call
when there's a null reference, and the compiler doesn't emit enough
information for the run-time to know what the static typing of the
variable was.

For a null value they can do whatever they (MS) like. They could have made
it work like an extension method. The funny thing I find is that you bring
up a load of crap that makes no sense and ignore the one big argument I was
expecting. Basically, implementing this feature would require every
underlying method to check for null, which would be a big overhead when this
feature wouldn't be used all that often.

Peter Duniho said:
[...]
Oh yes it can stoopid.

Ah, right. I forgot that when you run out of words, you start calling
people names. That really helps you look credible.

You really don't have a clue how you come across, do you, Peter? From your
previous posts I have absolutely zero respect for you. You're rather subtle
in attempting to belittle those you speak to, so that when it comes to the
crunch you can claim ignorance. Take your very first response to this post
above, suggesting I read wiki. This is a subtle way of saying "you're stupid"
but something you fully intended and something you do in most of your posts.
Perhaps you should go learn some people skills.

Michael
 
jp2msft

Dang guys! Be nice! Christmas is right around the corner and we all want him
to bring us more RAM and a Service Pack, don't we?

;)

 
Ben Voigt [C++ MVP]

Michael said:
The bulk of the object does exist even before you declare the object.
Generally the actual code will be greater than the object itself (in
size) and all of the code exists and is ready to call before an
instance is created.


There is no reason calling a function on a null object *has* to raise
an exception, the designers of C# just designed it that way. They
actually needed to put an extra check in to stop it working.
Basically when you call, say, object.ToString you end up calling a
global function ToString where the pointer to the object is passed
in, i.e.

x.ToString();

translates to (under the hood of course)

public static string ToString(object* pObject)
{
//convert pObject to string
}

I'm sure it's more complicated than that but that's the basic idea.

No, it's not really even all that close. And the devil is in the details.

The actual call x.ToString(), being virtual, ends up looking like it was
generated by something like this:

(*g_types[x->typetag].Methods['ToString'])(x);

unless the JIT hasn't compiled the MSIL into machine language yet, in which
case it does that before calling the function.

The actual function called depends on the actual run-time type of the
instance passed in. The run-time type is encoded in the first four bytes
(on x86) of each ref-typed object instance. The first four bytes of the
object instance referred to by null.... do you see the problem now? No
instance implies no type tag which implies no way to select the "right"
ToString implementation.
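
The type-tag lookup described above can be modelled in a few lines. This is only a toy sketch and not how the CLR is actually implemented: MethodTable, Instance, TypeTag, Payload and DispatchModel are all hypothetical names invented here. It shows why the lookup needs an instance: the table is reached through the object itself.

```csharp
using System;

// Toy model only: one table of method slots per type.
class MethodTable
{
    public Func<Instance, string> ToStringSlot;
}

// Toy model only: every instance carries a reference to its type's table,
// standing in for the hidden type tag at the start of a real object.
class Instance
{
    public MethodTable TypeTag;
    public string Payload;
}

static class DispatchModel
{
    public static readonly MethodTable ObjectTable =
        new MethodTable { ToStringSlot = self => "Object" };

    public static readonly MethodTable StringTable =
        new MethodTable { ToStringSlot = self => self.Payload };

    // Models the virtual call x.ToString(): read the tag from the instance,
    // then call through the slot found in that table.
    public static string VirtualToString(Instance x)
    {
        MethodTable table = x.TypeTag; // throws NullReferenceException if x is null
        return table.ToStringSlot(x);
    }
}
```

In this model, an Instance whose TypeTag is StringTable gets the string behaviour no matter how the variable holding it was declared, while passing null fails at the very first step, before any implementation has been selected.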
 
Ben Voigt [C++ MVP]

Oh, and just for grins, I can provide example code where x == null but
x.ToString() succeeds. Hint: !object.ReferenceEquals(x, null)
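
One way to read that hint, sketched here as an assumption rather than as the code Ben had in mind, is a type with an overloaded == operator. The Chameleon class below is hypothetical and deliberately odd: its operator reports equality whenever the right-hand side is null, so x == null is true even though x refers to a real instance.

```csharp
using System;

// Hypothetical type for illustration; do not write operators like this in real code.
class Chameleon
{
    public static bool operator ==(Chameleon a, Chameleon b)
    {
        // Deliberately misleading: claim equality whenever b is null.
        return ReferenceEquals(b, null) || ReferenceEquals(a, b);
    }

    public static bool operator !=(Chameleon a, Chameleon b)
    {
        return !(a == b);
    }

    // Overridden to keep Equals and GetHashCode consistent with each other.
    public override bool Equals(object obj) { return ReferenceEquals(this, obj); }
    public override int GetHashCode() { return base.GetHashCode(); }

    public override string ToString() { return "not really null"; }
}
```

With Chameleon x = new Chameleon(), the expression x == null evaluates to true through the overloaded operator, object.ReferenceEquals(x, null) is false, and x.ToString() succeeds because there is an instance behind the reference. A Nullable<T> value such as int? n = null also lets n.ToString() succeed (it returns an empty string), but that case does not match the hint, since boxing it yields a null reference.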
 
Michael C

I was going to reply to your individual replies but I thought 1 single
response would be better. The idea that you could call functions on a null
object was a kind of tongue in cheek "what if" kind of post. It was not
something I took too seriously myself and I know full well it's not going to
become a reality. An appropriate adult response would have been along the
lines of "I think it has some use but the disadvantages outweigh the
advantages". Your response was somewhat different and *clearly* deliberately
negative. The simple fact is it would be possible for MS to implement this
and it would not require removing virtual methods from the languages. It
would really not be implemented because every function would need to check
if a null reference had been passed in (imagine having to do if this == null
everywhere) and this massive amount of extra work would outweigh the minor
gains.

With regards to your attempting to belittle me (don't confuse your attempts
with me actually feeling belittled): I said you attempt to do this in a
subtle way so you can deny it when called on, and this is exactly what
you've done. Sending people wiki links is a subtle way of telling them they
are stupid and you know full well this is true. The funny thing is nothing I
have said has been wrong yet plenty you've posted has been. Your claim that
we need to remove virtuals is incorrect and many of your other reasons this
couldn't be implemented are incorrect. If MS wanted to do this they could.
You claimed that the ToString function does not look like what I posted
(with a pointer to this being passed in) but when asked to provide the real
function you dodged.

Michael
 
Pavel Minaev

I believe that this is a run-time rule. You can't call an instance method
on a null reference in C++/CLI either.

Not quite. The CLR allows one to call instance methods on null
receivers, so long as the call is not virtual (IL opcode "call" as
opposed to "callvirt"). The design decision for both C# and C++/CLI
was to use "callvirt" throughout, even for non-virtual methods,
precisely because it always does the null-check. But you can always
write your own IL which uses "call". Or you can use one of those
Delegate.CreateDelegate overloads which take an instance method and
expose its receiver as an explicit argument on the delegate, and then
pass null to that, in C# as well as any other .NET language.

Of course, Object.ToString is virtual, so none of the above applies to
it.
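
The Delegate.CreateDelegate route can be sketched as follows. Greeter, Describe and OpenDelegateDemo are hypothetical names made up for this example. The sketch assumes, as described above, that an open instance delegate over a non-virtual method passes its first argument straight through as the receiver without a null check.

```csharp
using System;
using System.Reflection;

// Hypothetical type for illustration.
class Greeter
{
    // Non-virtual instance method that never touches any instance state,
    // so nothing is dereferenced even when 'this' is null.
    public string Describe()
    {
        return this == null ? "called on null" : "called on an instance";
    }
}

static class OpenDelegateDemo
{
    public static string Call(Greeter receiver)
    {
        MethodInfo method = typeof(Greeter).GetMethod("Describe");

        // Open instance delegate: the delegate type has one extra leading
        // parameter, which is used as the receiver of the instance method.
        var open = (Func<Greeter, string>)Delegate.CreateDelegate(
            typeof(Func<Greeter, string>), method);

        return open(receiver);
    }
}
```

OpenDelegateDemo.Call(new Greeter()) takes the ordinary path, while OpenDelegateDemo.Call(null) is the case discussed above, where the method body runs with a null receiver. The same trick does not help with ToString, because the call has to be non-virtual.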
 
Pavel Minaev

Michael C said:
If we have something like this

object x = null;
MessageBox.Show(x.ToString())

we get an error. That kind of makes sense but there is no reason the
behaviour couldn't be to return an empty string.

Of all object-oriented languages with single dispatch I know of, the
only one which explicitly permitted null receivers for non-virtual
methods was Delphi. Even in C++, calling a non-virtual method on a
null pointer is U.B. (it _usually_ works on a typical implementation,
but relying on that is not a good idea). Why should C# be any
different?

By the way, RemObjects Oxygene has what you want, though it works in a
slightly different way - they have a colon-operator for method calls,
which works the same as dot, except that it returns null if the
receiver is null - i.e., for o==null, o.ToString is still an
exception, but o:ToString is null. The convenience of defining it thus
is that you can chain it - a:b:c:ToString - and know that you'll get
null if any of the subexpressions in the call chain yield null. This
is often handy when working with deep object graphs, XPath-style, and
would be convenient to have in C#. But it's still rather different
from what you propose.
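
For readers on newer compilers: C# 6 and later have a null-conditional operator, written ?., that behaves much like the colon operator described above. A minimal sketch follows, where Node, Next, Name and NullConditionalDemo are hypothetical names chosen for the example.

```csharp
using System;

// Hypothetical linked structure used to show a chained lookup.
class Node
{
    public Node Next;
    public string Name;
}

static class NullConditionalDemo
{
    // Each ?. yields null instead of throwing when its left-hand side is null,
    // so the whole chain evaluates to null if any link is missing.
    public static string ThirdName(Node first)
    {
        return first?.Next?.Next?.Name;
    }

    // For a plain object, o?.ToString() is null when o is null.
    public static string SafeToString(object o)
    {
        return o?.ToString();
    }
}
```

As with the Oxygene operator, a plain o.ToString() on a null reference still throws. Only the ?. form returns null.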
 
Michael C

Pavel Minaev said:
This
is often handy when working with deep object graphs, XPath-style, and
would be convenient to have in C#. But it's still rather different
from what you propose.

Thanks for the reply. I think extension methods do what I want so something
built into the language specifically for this probably wouldn't be much
advantage. I'm not sure if it would replace your example of a:b:c etc but
for simple stuff like ToString it works quite well.

Michael
 
G.S.

Quite a talk here!

Commenting on the original post: letting it slide and not throwing the
error (yes, it can't be done because of the missing instance, the
vtable and the rest of the Universe, and yes, Thee C# One could have
wired the runtime/compiler to let it slide) feels to me a bit like
VB without Option Explicit.
After all, doesn't it all boil down to being bug-free when x is null?
Anyway... keep using up the bandwidth. Young people need this kinda
talk, otherwise they'll think the virtual table is a table with no
legs.
 
