Jon Skeet said:
Okay, I'll mail the appropriate people and post back with whatever I find.
<snip>
Right - Eric Lippert has replied. It's probably worth reproducing the
question as I asked it, to make the reply make more sense. Be warned,
it's a long and detailed answer - just the kind I like.
Question:
I've been aware of a few differences between what the C# spec claims is
allowed and what the CLR allows when it comes to conversions. However,
I think this is a new one on me:
using System;
class Test
{
    static void Main(string[] args)
    {
        Array ints = new int[] {1, 2, 3};
        uint[] uints = (uint[]) ints;
        Console.WriteLine(uints.GetType());
    }
}
This runs with no exceptions, and produces the following output:
System.Int32[]
(Interestingly, if you box an element of "uints" that *does* get boxed
as a uint, because it's the compiler which specifies the type there.)
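The boxing remark above can be checked with a small sketch. The class and method names here are hypothetical, chosen only for illustration; the behaviour shown is what I'd expect from the Microsoft CLR, given that the compiler picks the box type from the static element type of "uints":

```csharp
using System;

public static class BoxingDemo
{
    // Runtime type of the array object after the int[] -> uint[] cast.
    public static string ArrayRuntimeType()
    {
        Array ints = new int[] { 1, 2, 3 };
        uint[] uints = (uint[]) ints;      // accepted by the CLR
        return uints.GetType().FullName;   // still the original array object
    }

    // Runtime type of an element boxed through the uint[] reference.
    public static string BoxedElementType()
    {
        Array ints = new int[] { 1, 2, 3 };
        uint[] uints = (uint[]) ints;
        object boxed = uints[0];           // compiler boxes as System.UInt32
        return boxed.GetType().FullName;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayRuntimeType());   // System.Int32[]
        Console.WriteLine(BoxedElementType());   // System.UInt32
    }
}
```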
The relevant bit of the C# spec is 6.2.4:
<quote>For an explicit reference conversion to succeed at run-time, the
value of the source operand must be null, or the actual type of the
object referenced by the source operand must be a type that can be
converted to the destination type by an implicit reference conversion
(§6.1.6). If an explicit reference conversion fails, a
System.InvalidCastException is thrown.
</quote>
Now, there's no implicit reference conversion from int[] to uint[], so
already the spec has been violated. However, I'd expected to see why
the CLR allowed this in ECMA-335 - so we move on to partition 3, 4.3,
the castclass instruction (as emitted by the C# compiler):
<quote>
Note that:
1. Arrays inherit from System.Array.
2. If Foo can be cast to Bar, then Foo[] can be cast to Bar[].
3. For the purposes of note 2 above, enums are treated as their
underlying type: thus E1[] can be cast
to E2[] if E1 and E2 share an underlying type.
</quote>
Now, this would seem to make a certain amount of sense, if we deem that
"int can be cast to uint" - except that I'm not sure of the use of the
word "cast" here. It certainly doesn't work for everything: you can
cast an int to a float, or a long, or a byte - but you can't convert
int[] to float[], long[] or byte[].
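As a rough illustration of the asymmetry described above, the following sketch tries each array cast and reports whether the runtime accepts it. The helper names are made up for this example, and the results in the comments are what I'd expect on the Microsoft CLR rather than something the specs guarantee:

```csharp
using System;

public static class ArrayCastDemo
{
    // Each helper attempts one castclass and reports whether it succeeded.
    public static bool CastsToUInt(Array a)
    {
        try { return ((uint[]) a).Length >= 0; }
        catch (InvalidCastException) { return false; }
    }

    public static bool CastsToFloat(Array a)
    {
        try { return ((float[]) a).Length >= 0; }
        catch (InvalidCastException) { return false; }
    }

    public static bool CastsToLong(Array a)
    {
        try { return ((long[]) a).Length >= 0; }
        catch (InvalidCastException) { return false; }
    }

    public static bool CastsToByte(Array a)
    {
        try { return ((byte[]) a).Length >= 0; }
        catch (InvalidCastException) { return false; }
    }

    public static void Main()
    {
        Array ints = new int[] { 1, 2, 3 };
        Console.WriteLine(CastsToUInt(ints));    // True
        Console.WriteLine(CastsToFloat(ints));   // False
        Console.WriteLine(CastsToLong(ints));    // False
        Console.WriteLine(CastsToByte(ints));    // False
    }
}
```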
So, to sum up:
1) Assuming the C# spec wants to leave a little wiggle room for the
CLR, I don't think it should be quite so prescriptive - it would be
worth mentioning that the runtime may make some extra conversions
available. It's a bit of a shame to have areas of uncertainty like this
(we certainly wouldn't want the CLR to start converting completely
unrelated types, for instance) but I understand there's a small matter
of pragmatism.
2) It looks to me like either ECMA-335 is poorly worded, and/or the CLR
is violating it.
Anyone care to enlighten me?
------------------- End of question ---------------------
Eric's response:
Jon's analysis is pretty much correct; there are conversions which the
CLR allows which C# does not. Because of that, if you hammer on it hard
enough, you can make a C# program which you would think ought to throw
an invalid cast exception, but in fact succeeds.
However, though I hate to be contradictory, I must point out that this
is not correct:
"you can cast an int to a float, or a long, or a byte"
It is not legal to issue a castclass instruction from int to float.
Remember, you are talking about the CLR definition of "cast" here, not
the C# definition, so do not confuse the two.
The confusion is our fault, in two ways.
First source of confusion: in C# we have conflated two completely
different operations as "cast" operations. The two operations that we
have conflated are what the CLR calls casts and coercions.
8.3.2 Coercion
Sometimes it is desirable to take a value of a type that is not
assignment-compatible with a location, and convert the value to a
type that is assignment-compatible. This is accomplished through
coercion of the value.
Coercion takes a value of a particular type and a desired type and
attempts to create a value of the desired type that has equivalent
meaning to the original value. Coercion can result in
representation changes as well as type changes; hence coercion does
not necessarily preserve the identity of two objects.
There are two kinds of coercion: widening, which never loses
information, and narrowing, in which information might be lost. An
example of a widening coercion would be coercing a value that is a
32-bit signed integer to a value that is a 64-bit signed integer.
An example of a narrowing coercion is the reverse: coercing a
64-bit signed integer to a 32-bit signed integer. Programming
languages often implement widening coercions as implicit
conversions, whereas narrowing coercions usually require an
explicit conversion.
Some widening coercion is built directly into the VES operations on
the built-in types (see §12.1). All other coercion shall be
explicitly requested. For the built-in types, the CTS provides
operations to perform widening coercions with no runtime checks and
narrowing coercions with runtime checks.
8.3.3 Casting
Since a value can be of more than one type, a use of the value
needs to clearly identify which of its types is being used. Since
values are read from locations that are typed, the type of the
value which is used is the type of the location from which the
value was read. If a different type is to be used, the value is
cast to one of its other types. Casting is usually a compile time
operation, but if the compiler cannot statically know that the
value is of the target type, a runtime cast check is done. Unlike
coercion, a cast never changes the actual type of an object nor
does it change the representation. Casting preserves the identity
of objects.
For example, a runtime check might be needed when casting a value
read from a location that is typed as holding a value of a
particular interface. Since an interface is an incomplete
description of the value, casting that value to be of a different
interface type will usually result in a runtime cast check.
We conflate these two things in C#, using the same operator syntax and
terminology for both casts and coercions.
So now it should be clear that there is no "cast" from int to float in
the CLR. That's a coercion, not a cast.
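To make the distinction concrete, here is a minimal sketch contrasting the two operations that C# writes with the same syntax. The class and method names are hypothetical; the point is only that the first operation hands back the same object, while the second produces a value with a different bit pattern:

```csharp
using System;

public static class CastVersusCoercion
{
    // A cast in the CLR sense: the same object, viewed as another of its types.
    public static bool CastPreservesIdentity()
    {
        object original = "hello";
        string viewedAsString = (string) original;   // castclass
        return object.ReferenceEquals(original, viewedAsString);
    }

    // A coercion: int 1 and float 1.0 have different representations.
    public static bool CoercionChangesBits()
    {
        int i = 1;
        float f = (float) i;                         // a conversion, not castclass
        int floatBits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
        return floatBits != i;
    }

    public static void Main()
    {
        Console.WriteLine(CastPreservesIdentity());  // True
        Console.WriteLine(CoercionChangesBits());    // True
    }
}
```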
Second source of confusion: inconsistency in the CLR spec.
The CLR spec says in section 8.7:
Signed and unsigned integral primitive types can be assigned to
each other; e.g., int8 := uint8 is valid. For this purpose, bool
shall be considered compatible with uint8 and vice versa, which
makes bool := uint8 valid, and vice versa. This is also true for
arrays of signed and unsigned integral primitive types of the same
size; e.g., int32[] := uint32[] is valid.
And in section 4.3:
If the class of the object on the top of the stack does not
implement class (if class is an interface), and is not a derived
class of class (if class is a regular class), then an
InvalidCastException is thrown.
...
2. If Foo can be cast to Bar, then Foo[] can be cast to Bar[].
Where does the spec for castclass say that int32[] can be cast to
uint32[]? It doesn't. It should! int32 and uint32 are assignment
compatible, so they can be cast from one to the other without changing
bits. But they do not implement or derive from each other, so a strict
reading of the spec says that this cast should fail, and therefore
int32[] to uint32[] should also fail.
Clearly that is not what was meant and not what was implemented.
Casting between assignment-compatible types should be legal. Really
what this should say is something like "If Foo can be cast to Bar or
Foo is assignment compatible with Bar then Foo[] can be cast to Bar[]".
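A small sketch of what "without changing bits" means in practice: the cast goes via object because the C# compiler rejects a direct int[] to uint[] conversion. The names are hypothetical, and the results in the comments are what I'd expect from the Microsoft CLR:

```csharp
using System;

public static class AssignmentCompatibleArrays
{
    // The uint[] reference points at the very same array object.
    public static bool SameObject()
    {
        int[] ints = { -1, 2, 3 };
        uint[] uints = (uint[]) (object) ints;
        return object.ReferenceEquals(ints, uints);
    }

    // The bits of -1 are simply read back as an unsigned value.
    public static uint ReinterpretedElement()
    {
        int[] ints = { -1, 2, 3 };
        uint[] uints = (uint[]) (object) ints;
        return uints[0];
    }

    // A write through one reference is visible through the other.
    public static int WriteThrough()
    {
        int[] ints = { -1, 2, 3 };
        uint[] uints = (uint[]) (object) ints;
        uints[1] = 7;
        return ints[1];
    }

    public static void Main()
    {
        Console.WriteLine(SameObject());            // True
        Console.WriteLine(ReinterpretedElement());  // 4294967295
        Console.WriteLine(WriteThrough());          // 7
    }
}
```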
Fortunately, the CLR guys did NOT extend this goofy kind of type
variance to covariant and contravariant interfaces, which as you know
we are probably adding in a future version of C#. That is, if we make
IEnumerable<T> covariant in T, it will NOT be possible to do a clever
series of casts to trick the CLR into assigning an IEnumerable<int> to
an IEnumerable<uint>, even though it is possible to make int[] go to
uint[]. However, I think it is possible - I haven't checked this yet -
to leverage the fact that int[] goes to uint[] to similarly force
IEnumerable<int[]> to go to IEnumerable<uint[]>.
This situation - the CLR being more generous about what
identity-preserving casts are legal - may end up considerably
complicating my life in other ways involving covariance and
contravariance as we attempt to detect ambiguous conversions at compile
time, but that's another story and we are still researching it.
------------------- End of answer ---------------------
I'd just like to thank Eric for posting such a complete and
illuminating answer.