2.0 Generics - Feature or Bug?

Zachary Turner

Consider the following generic class:

class MyClass<T>
{
    private MyInterface _SomeInterface;
    private string _SomeFormatString;

    public string GetSomeFormattedString()
    {
        T[] ArrayOfTs = _SomeInterface.GetArrayOfTs();
        return String.Format(_SomeFormatString, ArrayOfTs);
    }
}

Assume _SomeFormatString is a valid format string containing a number
of parameter replacement strings equal to the number of items in
ArrayOfTs.

Utilizing this class as follows:

MyClass<object> ObjectClass = new MyClass<object>();
string FormattedString = ObjectClass.GetSomeFormattedString();

throws an exception on the call to String.Format(). It happens because
the compiler invokes the wrong overload. It appears to invoke the
overload

String.Format(string Format, object arg)

instead of the correct one

String.Format(string Format, params object[] args)

Is this a feature, or a bug? I'm noticing more and more that the .NET
Generics engine seems to actually do very little at compile time, which
in my mind is almost defeating the purpose of Generics in the first
place.

I can work around the problem by calling

return String.Format(_SomeFormatString, (object[])ArrayOfTs);

or perhaps more robustly (in case T is something other than object) by
the rather obtuse code

object[] ConvertedValues = Array.ConvertAll<T, object>
(
    ArrayOfTs,
    delegate(T Thing) { return (object)Thing; }
);
return String.Format(_SomeFormatString, ConvertedValues);
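
A self-contained sketch of the Array.ConvertAll workaround, for anyone who wants to try it. The class name Formatter<T> and its method signature are assumptions made for illustration; they are not part of MyClass<T> above.

```csharp
using System;

// Hypothetical helper, not from the original post.
static class Formatter<T>
{
    public static string Format(string format, T[] items)
    {
        // Copy every element into an object[] (boxing value types), so the
        // params object[] overload of String.Format is the one selected.
        object[] converted = Array.ConvertAll<T, object>(
            items,
            delegate(T item) { return (object)item; });
        return String.Format(format, converted);
    }
}

static class Program
{
    static void Main()
    {
        // Works for value types as well as reference types.
        Console.WriteLine(Formatter<int>.Format("{0}+{1}", new int[] { 1, 2 }));
        Console.WriteLine(Formatter<string>.Format("{0} {1}", new string[] { "a", "b" }));
    }
}
```

The cost is one extra array allocation per call, which is usually negligible next to the formatting itself.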
 
Oliver Sturm

Hello Zachary,

Zachary Turner said:
throws an exception on the call to String.Format(). It happens because
the compiler invokes the wrong overload. It appears to invoke the
overload

String.Format(string Format, object arg)

instead of the correct one

String.Format(string Format, params object[] args)

I find this behaviour very logical - why do you think the second overload
is "the correct one"? I haven't checked, but I'm pretty sure if there was
this overload

String.Format(string Format, params T[] args)

it would be used. But there isn't, and so the next best choice is made.

Zachary Turner said:
I'm noticing more and more that the .NET
Generics engine seems to actually do very little at compile time, which
in my mind is almost defeating the purpose of Generics in the first
place.

I don't understand this statement. I think one major advantage of .NET
Generics over C++ Templates is that they are fully integrated with the
type system, as opposed to being some (pre-)compiler trick (sorry C++
people, I'm putting it this way to make a point). I don't see what your
statement has to do with the example above and I also don't think it's
correct.


Oliver Sturm
(Away over Christmas, sorry for delays in replies.)
 
Zachary Turner

Oliver said:
I find this behaviour very logical - why do you think the second overload
is "the correct one"? I haven't checked, but I'm pretty sure if there was
this overload

String.Format(string Format, params T[] args)

it would be used. But there isn't, and so the next best choice is made.

T isn't a type though, it's a placeholder for a type. object is a
type. An overload like the one you mention shouldn't be necessary,
because the compiler has all the information it needs before it decides
to emit any IL. It knows that I'm instantiating the class with
T=object; therefore it should be able to use that information to make
a better decision about which overload to invoke. Again, I ask if this
is a feature or a bug because maybe this was a conscious decision on
the part of the language designers. If so, however, I'm curious what
the motivation was. As a programmer, I don't like it when the language
dictates to me that I don't actually know what I'm doing, and tries to
make decisions for me based on what it thinks I intend.

Oliver said:
I don't understand this statement. I think one major advantage of .NET
Generics over C++ Templates is that they are fully integrated with the
type system, as opposed to being some (pre-)compiler trick (sorry C++
people, I'm putting it this way to make a point). I don't see what your
statement has to do with the example above and I also don't think it's
correct.

It's related because, just like in the above example, the compiler
should have had all the information it needed at compile time to deduce
the correct overload, yet it didn't. I can think of tons of similar
and equally aggravating examples that I've encountered while doing .NET
programming.

Here's a simple problem: Write a generic class that operates on numbers
and contains one function to add two values together and return the
result, where all type-checking is done at compile time. Much to my
dismay, this is unsolvable with .NET. Here's what I would like to do:

class Adder<T>
{
    public T Add(T Param1, T Param2)
    {
        return Param1 + Param2;
    }
}

Again, I know what I'm doing. I don't plan to use this class with
anything that doesn't have a + operator. And if I mess up and do, why
not just generate a compiler error -then-? Why forbid legitimate
cases?
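
For comparison, a minimal sketch of one workaround that does compile, assuming the caller is willing to supply the addition itself. The constructor parameter is hypothetical and not part of the Adder<T> above. Func<T, T, T> and lambdas arrived after .NET 2.0; under 2.0 the same idea would need a hand-declared delegate type and an anonymous method.

```csharp
using System;

// Hypothetical variant of Adder<T>: the + operation is passed in by the
// caller, so the compiler checks it where the concrete type is known.
class Adder<T>
{
    private readonly Func<T, T, T> _add;

    public Adder(Func<T, T, T> add)
    {
        _add = add;
    }

    public T Add(T param1, T param2)
    {
        return _add(param1, param2);
    }
}

static class Program
{
    static void Main()
    {
        Adder<int> ints = new Adder<int>((a, b) => a + b);
        Console.WriteLine(ints.Add(2, 3));

        Adder<string> strings = new Adder<string>((a, b) => a + b);
        Console.WriteLine(strings.Add("ab", "cd"));
    }
}
```

A type without a + operator fails at the lambda, at compile time, which is close to the "generate a compiler error then" behaviour asked for above, at the price of one delegate call per addition.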

Here's another case that I spent hours trying to find a compile-time
solution to and failed. Write a generic method parameterized with a
primitive type T that takes a string and converts it to a first-class
object of type T. Again, can't be done, but here's what I'd like to
do:

class Converter
{
    public static T Convert<T>(string Input)
    {
        return T.Parse(Input);
    }
}

Why can't the compiler know whether or not I have a method called Parse
at compile time? I can understand if I used this class as follows:

Converter.Convert<typeof(SomeObject)>("47.5");

But I can't understand if I used the class as:

Converter.Convert<int>("47.5");

because it's right there, hardcoded. Again, I ask if it's a feature or
a bug because honestly I don't know. I have to assume it's by design,
but I'm curious as to the motivation, and I definitely want to sign the
petition to have the Generics support enhanced in a later iteration of
.NET.
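
One way to keep the check at compile time, sketched under the assumption that the caller may pass the parsing function explicitly. The extra parse parameter is hypothetical and not in the code above; Func<string, T> also postdates .NET 2.0, where a custom delegate type would take its place.

```csharp
using System;
using System.Globalization;

static class Converter
{
    // Hypothetical signature: the caller names the parser, so the compiler
    // verifies at the call site that it maps string to T.
    public static T Convert<T>(string input, Func<string, T> parse)
    {
        return parse(input);
    }
}

static class Program
{
    static void Main()
    {
        int whole = Converter.Convert<int>("47", int.Parse);
        double fractional = Converter.Convert<double>(
            "47.5",
            s => double.Parse(s, CultureInfo.InvariantCulture));

        Console.WriteLine(whole);
        Console.WriteLine(fractional);
    }
}
```

Passing a type that has no suitable parser simply leaves the caller with nothing to pass, so the mistake surfaces when the call is compiled rather than at run time.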
 
Jon Skeet [C# MVP]

Zachary Turner said:
throws an exception on the call to String.Format(). It happens because
the compiler invokes the wrong overload. It appears to invoke the
overload

String.Format(string Format, object arg)

instead of the correct one

String.Format(string Format, params object[] args)

The overload is decided at the compile time of MyClass<T>. It can't
call the latter method, as T might not be a reference type. For
instance, using MyClass<int>, you'd be trying to call a method
accepting object[] with an int[], which isn't valid.
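
A minimal sketch of that behaviour, using a hypothetical helper Unconstrained<T> that is not from the original post. Because T is unconstrained, T[] has no implicit conversion to object[], so the whole array is bound as one argument:

```csharp
using System;

// Hypothetical helper, not from the original post.
static class Unconstrained<T>
{
    public static string Format(string format, T[] items)
    {
        // Overload resolution happens here, once, for every possible T.
        // The array can only be passed as a single object argument.
        return String.Format(format, items);
    }
}

static class Program
{
    static void Main()
    {
        object[] values = new object[] { "a", "b" };

        // One placeholder: argument {0} is the array itself, so this
        // prints the array's type name rather than its elements.
        Console.WriteLine(Unconstrained<object>.Format("{0}", values));

        try
        {
            // Two placeholders, but only one argument was supplied.
            Unconstrained<object>.Format("{0} {1}", values);
        }
        catch (FormatException)
        {
            Console.WriteLine("FormatException, even though T is object");
        }
    }
}
```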
 
Zachary Turner

Jon said:
The overload is decided at the compile time of MyClass<T>. It can't
call the latter method, as T might not be a reference type. For
instance, using MyClass<int>, you'd be trying to call a method
accepting object[] with an int[], which isn't valid.

Right, I wouldn't expect that case to work in fact. But using
MyClass<object> doesn't even work. That's my main question.
 
Jon Skeet [C# MVP]

Zachary Turner said:
Right, I wouldn't expect that case to work in fact. But using
MyClass<object> doesn't even work. That's my main question.

But that's the point - the decision of which method to call is made at
the compile time of MyClass<T> when it doesn't know what T is - so it
*can't* choose the overload which takes object[].

Now, if you change MyClass<T> to add the generic constraint
"where T : class" then the compiler *can* guarantee that T[] is
compatible with object[], and it will pick the other overload.
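
A short sketch of the constrained version, again with a hypothetical helper name (Constrained<T>) rather than the original MyClass<T>:

```csharp
using System;

// Hypothetical helper, not from the original post.
static class Constrained<T> where T : class
{
    public static string Format(string format, T[] items)
    {
        // T is known to be a reference type, so T[] converts implicitly to
        // object[] (array covariance) and each element fills a placeholder.
        return String.Format(format, items);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(
            Constrained<string>.Format("{0} {1}", new string[] { "a", "b" }));
    }
}
```

The trade-off is that Constrained<int> no longer compiles at all, which matches the earlier observation that the int[] case cannot work with this overload anyway.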
 
Victor Rosenberg

Has Jon's suggestion helped, Zachary?

I am curious :)

 
Zachary Turner

It's given me something to think about, but I still haven't decided
whether or not I like it. Well, let me rephrase: I still feel .NET
Generics are lacking in that respect. Perhaps I just grew too attached
to the C++ approach, where the compiler actually generates new classes at
compile time and creates additional code for you, whereas the .NET
paradigm seems to be that only one class is ever emitted into the IL.
In this respect, my suspicion all along was correct: .NET
Generics do very little at compile time. They don't, for example, emit
entirely new classes into the IL with the actual parameterized type
hardcoded into the generated class. I suspect there may be no (easy)
way around this, since generic classes need to be reflectable
just like anything else. On the other hand, I think there must be a
compromise hiding somewhere that still allows all the necessary
type safety and reflection support while providing the extra
power of compile-time class generation. I don't know what that
compromise is, but I hope that the language designers find it someday,
perhaps through expanded support for more advanced where clauses.

I long for the day when this class compiles on types T with an inherent
or overloaded + operator.

class C<T>
{
    public T Add(T t1, T t2)
    {
        return t1 + t2;
    }
}

 
