Introducing generics in API design may stem from lack of co- and contravariance. Asking for advice o


Anders Borum

Hello,

I've worked on an API for quite some time and have (on several occasions)
tried to introduce generics at the core abstract level of business objects
(especially a hierarchical node). The current non-generic implementation is
functional, but not as clean as I would like. Although not sure, I believe
my problems stem from lacking support of co- and contravariance in C# (which
I'm desperately hoping will make it in the next version).

The reason for this post is to ask for your feedback (i.e. whether my
assumptions are right or wrong) and hopefully get some directions. Right now
it feels like I'm working against the compiler, even though the API design
looks clean to me. However, if I'm trying to do the impossible, that would
obviously be an important lesson, and I would keep working with the
non-generic version of the API (and refactor if or when C# can support the
requirements).

// Start of examples

I would like to "say" the following in C#:

1. Define a generic hierarchical type (HierarchyNode<T>) that allows a
derived class (Page) to implement covariant return types. As seen below,
the class Page defines itself "as a" HierarchyNode<Page>, but can't
implement the abstract members - even though the return type in fact "is a"
HierarchyNode<Page>. It's the same situation with
NodeCollection<HierarchyNode<T>>, which I would like to implement using a
covariant NodeCollection<Page> return type.

abstract class Node {}

abstract class HierarchyNode<T> : Node where T : Node
{
    // If these properties are declared as T instead of HierarchyNode<T>, the
    // structure is not hierarchical (i.e. HierarchyNode<T>.Parent.Parent is
    // not possible), because T is only constrained to Node.

    // public abstract T Parent { get; }
    // public abstract NodeCollection<T> Children { get; }

    public abstract HierarchyNode<T> Parent { get; }
    public abstract NodeCollection<HierarchyNode<T>> Children { get; }
}

class NodeCollection<T> where T : Node {}

// a concrete hierarchical business object
class Page : HierarchyNode<Page>
{
    // the return types differ from the abstract declarations,
    // so the compiler rejects these overrides
    public override Page Parent
    {
        get { return new Page(); }
    }

    public override NodeCollection<Page> Children
    {
        get { return new NodeCollection<Page>(); }
    }
}

The following compiler error is raised when compiling the snippet:
Error 1 'Page' does not implement inherited abstract member
'HierarchyNode<Page>.ChildNodes.get' 16 15 Generics.

Please note that I have also tried the following signature:

abstract class HierarchyNode<T> : Node where T : HierarchyNode<T>
{
    public abstract T Parent { get; }
    public abstract NodeCollection<T> Children { get; }
}

but it makes it impossible to test whether a Node "is a" HierarchyNode<T>,
because T is never convertible to Node.

void ProcessNode<T>(T node) where T : CmsNode
{
    // does not compile (which construct should be used here?)
    if (node is HierarchyNode<?>)
    {
        // process node
    }
}


2. Cast a type to a generic type definition. Because HierarchyNode<T> is
generic, it's impossible to test whether a Node in fact "is a"
HierarchyNode<T> unless you know the exact type of T, and that might not be
available (i.e. it requires a generic method). The following snippet is an
example of a common pattern implemented in a method that takes a Node as its
argument. I realize there's a potential way around this situation:
providing two different methods, one taking a Node and another taking a
generic type (ProcessNode<T>(HierarchyNode<T> node) where T : Node), but
that's convoluted and left out of the example.

void ProcessNode(Node node)
{
    // process node

    if (node is HierarchyNode<Node>)
    {
        // should process node as a hierarchical structure
        // but never happens.

        foreach (HierarchyNode<Node> child in ((HierarchyNode<Node>)node).Children)
        {
            // should process child as a hierarchical structure
            // but throws a compiler error.
        }
    }
}

// just call the worker method with an instance of a Node (should not care
// whether it's hierarchical or not)
ProcessNode(new Page());

The following compiler error is raised when compiling the snippet:
Error 1 Cannot convert type 'HierarchyNode<Page>' to 'HierarchyNode<Node>'
75 4 Generics

// End of examples
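
As a point of comparison for example 2: later C# versions (4.0 and up) accept variance annotations on generic interfaces, though not on classes. The sketch below assumes a hypothetical IHierarchyNode<out T> interface and an Add method, neither of which is part of the API above, and shows the "is a hierarchy of Node" test succeeding for a Page. It is an illustration under those assumptions only, not a fix for the class-based design.

```csharp
using System;
using System.Collections.Generic;

public abstract class Node { }

// Hypothetical interface (not in the original API).
// 'out T' marks T as covariant: T may only appear in output positions.
public interface IHierarchyNode<out T> where T : Node
{
    T Parent { get; }
    IEnumerable<T> Children { get; }   // IEnumerable<out T> is itself covariant
}

public class Page : Node, IHierarchyNode<Page>
{
    private readonly List<Page> children = new List<Page>();

    public Page Parent { get; private set; }
    public IEnumerable<Page> Children { get { return children; } }

    // Hypothetical helper for building a small tree.
    public void Add(Page child)
    {
        child.Parent = this;
        children.Add(child);
    }
}

public static class Processor
{
    // Returns the number of direct children, or -1 for non-hierarchical nodes.
    public static int ProcessNode(Node node)
    {
        // Succeeds for Page because IHierarchyNode<Page> converts to IHierarchyNode<Node>.
        IHierarchyNode<Node> hierarchy = node as IHierarchyNode<Node>;
        if (hierarchy == null)
            return -1;

        int count = 0;
        foreach (Node child in hierarchy.Children)
            count++;
        return count;
    }

    public static void Main()
    {
        Page root = new Page();
        root.Add(new Page());
        root.Add(new Page());
        Console.WriteLine(ProcessNode(root)); // 2
    }
}
```

The trade-off is that a covariant T can only be returned, never accepted as a parameter, so mutating members such as Add have to live on the concrete class rather than on the interface.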


Over the course of the past few weeks I've tried many different approaches to
implementing the generic and concrete classes above, so that they allow
casting to and from generic versions etc., but I constantly find myself
cornered by compiler errors (or a class design I'd rather not live with).

Am I simply working in a corner of C# where variance is not yet as evolved?
I've thought about separating the hierarchy completely from the structure,
but it didn't really fit well with the design of the business objects, and as
far as I know I would still face some of the problems - namely conversion
problems between generic types.

Right now I've implemented the concrete API using abstract non-generic
classes (also collections) and hiding inherited members in concrete classes
(such as Page and PageCollection). I'd much rather skip the method hiding
and provide a single generic collection of T, but the trouble I'm facing
with generics (and I've really tried my best) simply made me give up.
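
To make the current approach concrete, here is a minimal sketch of what hiding inherited members can look like. The member names (AddChild, Add) and the use of IEnumerable in place of a dedicated PageCollection class are assumptions made for illustration, not the actual API.

```csharp
using System;
using System.Collections.Generic;

public abstract class Node
{
    private readonly List<Node> children = new List<Node>();

    // Weakly typed view, shared by every node type.
    public Node Parent { get; private set; }
    public IEnumerable<Node> Children { get { return children; } }

    // Hypothetical helper; the real API may build the tree differently.
    protected void AddChild(Node child)
    {
        child.Parent = this;
        children.Add(child);
    }
}

public class Page : Node
{
    // 'new' hides the inherited members and re-exposes them strongly typed.
    public new Page Parent { get { return (Page)base.Parent; } }

    public new IEnumerable<Page> Children
    {
        get
        {
            // Go through the base view explicitly, then cast each element.
            foreach (Node child in ((Node)this).Children)
                yield return (Page)child;
        }
    }

    public void Add(Page child) { AddChild(child); }
}

public static class HidingDemo
{
    public static void Main()
    {
        Page root = new Page();
        Page leaf = new Page();
        root.Add(leaf);

        Page typedParent = leaf.Parent;            // Page, no cast at the call site
        Node untypedParent = ((Node)leaf).Parent;  // same object through the base view
        Console.WriteLine(ReferenceEquals(typedParent, untypedParent)); // True
    }
}
```

The casts are hidden inside the concrete class, which is exactly the repetition a single generic collection of T would avoid.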

Perhaps generics were not intended for this scenario - which is sad, because
it would enable a wide range of opportunities.

Thanks for reading!

With regards
Anders Borum / SphereWorks
Microsoft Certified Professional (.NET MCP)
 

Pavel Minaev

Anders Borum said:
Please note that I have tried the following signature also:

abstract class HierarchyNode<T> : Node where T : HierarchyNode<T>
{
abstract T Parent { get; }
abstract NodeCollection<T> Children { get; }
}

This doesn't look too good, because it effectively requires the type of
parent of any T to also be T, and all children to be T as well. In other
words, it requires the tree to be homogeneous - if T is Page, then its parent
would have to be Page, and all children, too. I had the impression that it's
not what you want. Shouldn't it just be "Node Parent" and
"NodeCollection<Node> Children"? Of course, I may be wrong here...

Anders Borum said:
but it makes it impossible to test if a Node "is a" HierarchyNode<T>
because T is never convertible to Node.

void ProcessNode<T>(T node) where T : CmsNode
{
    // does not compile (which construct should be used here?)
    if (node is HierarchyNode<?>)
    {
        // process node
    }
}

If you need to distinguish two cases, then perhaps an overload would do
better here:

void ProcessNode<T>(T node) where T : Node;
void ProcessNode<T>(HierarchyNode<T> node) where T : HierarchyNode<T>;

Anders Borum said:
2. Cast a type to a generic type definition. Because the HierarchyNode<T>
is generic, it's impossible to actually test whether a Node in fact "is a"
HierarchyNode<T> unless you know the exact type of T and that might not be
available (i.e. requires a generic method).

It actually makes sense. Even if you would be able to somehow test that T is
"some kind of HierarchyNode", you wouldn't be able to use the result of such
cast anyway, since all members of HierarchyNode in your definition depend on
T. If there are any that don't, then consider refactoring them into a
separate non-generic abstract class, and deriving HierarchyNode<T> from
that.
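
A minimal sketch of that refactoring, under some assumptions: the non-generic members ParentNode and ChildNodes, the Add method, and the Asset class are hypothetical names chosen for illustration. The point is only that a non-generic layer lets a method test for "some kind of HierarchyNode" without knowing T.

```csharp
using System;
using System.Collections.Generic;

public abstract class Node { }

// Non-generic layer: everything that does not depend on T lives here.
public abstract class HierarchyNode : Node
{
    public abstract Node ParentNode { get; }
    public abstract IEnumerable<Node> ChildNodes { get; }
}

// Generic layer: adds the strongly typed view on top.
public abstract class HierarchyNode<T> : HierarchyNode where T : HierarchyNode<T>
{
    private readonly List<T> children = new List<T>();

    public T Parent { get; private set; }
    public IEnumerable<T> Children { get { return children; } }

    public void Add(T child)
    {
        HierarchyNode<T> target = child;  // implicit: T derives from HierarchyNode<T>
        target.Parent = (T)this;          // assumes the usual "class X : HierarchyNode<X>" shape
        children.Add(child);
    }

    public override Node ParentNode { get { return Parent; } }

    public override IEnumerable<Node> ChildNodes
    {
        get { foreach (T child in children) yield return child; }
    }
}

public class Page : HierarchyNode<Page> { }
public class Asset : Node { }   // hypothetical non-hierarchical node

public static class TreeWalker
{
    // Counts the node itself plus all descendants; a non-hierarchical node counts as 1.
    public static int Count(Node node)
    {
        int total = 1;
        HierarchyNode hierarchy = node as HierarchyNode;  // no T needed here
        if (hierarchy != null)
        {
            foreach (Node child in hierarchy.ChildNodes)
                total += Count(child);
        }
        return total;
    }

    public static void Main()
    {
        Page root = new Page();
        Page child = new Page();
        root.Add(child);
        child.Add(new Page());

        Console.WriteLine(Count(root));        // 3
        Console.WriteLine(Count(new Asset())); // 1
    }
}
```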
 

Anders Borum

Hi Pavel

Thanks for the feedback so far.

Pavel Minaev said:
This doesn't look too good, because it effectively requires the type of
parent of any T to also be T, and all children to be T as well. In other
words, it requires the tree to be homogeneous - if T is Page, then its
parent would have to be Page, and all children, too. I had the impression
that it's not what you want. Shouldn't it just be "Node Parent" and
"NodeCollection<Node> Children"? Of course, I may be wrong here...

Actually that is what I was after with the design: that a given type can
inherit a hierarchical representation and make itself the type of its parent /
child nodes. The problem is that using "public T Parent" (instead of "public
HierarchyNode<T> Parent") makes it impossible for generic methods to
actually use the hierarchy (they only see T, not HierarchyNode<T> - thus
node.Parent yields T, not allowing node.Parent.Parent).
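
For what it's worth, the node.Parent.Parent limitation seems to depend on which constraint is in play. The sketch below assumes the self-referencing constraint "where T : HierarchyNode<T>" on both the class and the generic method, plus a hypothetical Name property and a settable Parent for brevity. Under those assumptions a generic method can walk the hierarchy through T.

```csharp
using System;

public abstract class Node { }

// Self-referencing constraint: every T is itself a HierarchyNode<T>.
public abstract class HierarchyNode<T> : Node where T : HierarchyNode<T>
{
    public T Parent { get; set; }   // settable only to keep the sketch short
}

public class Page : HierarchyNode<Page>
{
    public string Name { get; set; }   // hypothetical member for the demo
}

public static class Hierarchy
{
    // Walks up until a node without a parent is found.
    public static T Root<T>(T node) where T : HierarchyNode<T>
    {
        while (node.Parent != null)
            node = node.Parent;     // Parent is T, and T exposes Parent again
        return node;
    }

    public static void Main()
    {
        Page home = new Page { Name = "home" };
        Page section = new Page { Name = "section", Parent = home };
        Page article = new Page { Name = "article", Parent = section };

        Console.WriteLine(article.Parent.Parent.Name); // home
        Console.WriteLine(Root(article).Name);         // home
    }
}
```

With only "where T : Node" on the method, the same loop would not compile, because Node exposes no Parent member.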

It would be nice to constrain T to HierarchyNode [...]

Pavel Minaev said:
If you need to distinguish two cases, then perhaps an overload would do
better here:

void ProcessNode<T>(T node) where T : Node;
void ProcessNode<T>(HierarchyNode<T> node) where T : HierarchyNode<T>;

It actually makes sense. Even if you would be able to somehow test that T
is "some kind of HierarchyNode", you wouldn't be able to use the result of
such cast anyway, since all members of HierarchyNode in your definition
depend on T. If there are any that don't, then consider refactoring them
into a separate non-generic abstract class, and deriving HierarchyNode<T>
from that.

This signature "HierarchyNode<T> : Node where T : Node" illustrates that
it's fair to conclude that Node potentially could be a HierarchyNode<Node>,
but alas, the following generic actually works.

void ProcessNode<T>(T node) where T : CmsNode
{
    // compiles, because T is known inside the generic method
    if (node is HierarchyNode<T>)
    {
        HierarchyNode<T> hierarchy = node as HierarchyNode<T>;
        // process node
    }
}
 

Pavel Minaev

Anders Borum said:
That would make calling ProcessNode<T>(T node) where T : Node messy

Why? If a specific T is just a Node, then that's the overload that will get
called.

Anders Borum said:
This signature "HierarchyNode<T> : Node where T : Node" illustrates that
it's fair to conclude that Node potentially could be a
HierarchyNode<Node>,

No, because a given T:Node is T:HierarchyNode<T>, which is not the same as
T:HierarchyNode<Node>. Essentially, it boils down to the fact that
HierarchyNode<Derived> cannot be upcast to HierarchyNode<Base>. Which,
given your definition for HierarchyNode, is entirely correct, since such an
upcast would break the type system (similar to List<Base> and
List<Derived>).

Anders Borum said:
but alas, the following generic actually works.

void ProcessNode<T>(T node) where T : CmsNode
{
    // compiles, because T is known inside the generic method
    if (node is HierarchyNode<T>)
    {
        HierarchyNode<T> hierarchy = node as HierarchyNode<T>;
        // process node
    }
}

It's easy to break this. I assume that you have CmsNode defined thus:

class CmsNode : HierarchyNode<CmsNode> { ... }

Now imagine that you derive further:

class OtherNode : CmsNode { ... }

Now we try to pass OtherNode to our ProcessNode<T>(). T is inferred as
OtherNode, and the instantiation will look like this:

void ProcessNode(OtherNode node)
{
    if (node is HierarchyNode<OtherNode>)
    {
        HierarchyNode<OtherNode> hierarchy = node as HierarchyNode<OtherNode>;
        // process node
    }
}

In terms of static type checking, everything is fine. The problem is that
OtherNode does not extend HierarchyNode<OtherNode> - it extends (indirectly)
HierarchyNode<CmsNode>, which is not checked here. So it will not pass the
check, even though it's a HierarchyNode...
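
The failure can be reproduced with a small self-contained program. The sketch assumes the CmsNode and OtherNode definitions above and wraps the type test in a hypothetical helper, IsHierarchyOf<T>, constrained to Node so that it can be called with either class.

```csharp
using System;

public abstract class Node { }
public abstract class HierarchyNode<T> : Node where T : Node { }

public class CmsNode : HierarchyNode<CmsNode> { }
public class OtherNode : CmsNode { }

public static class VarianceCheck
{
    // Hypothetical helper: the same 'is' test as in the quoted ProcessNode<T>.
    public static bool IsHierarchyOf<T>(T node) where T : Node
    {
        return node is HierarchyNode<T>;
    }

    public static void Main()
    {
        // T inferred as CmsNode: CmsNode extends HierarchyNode<CmsNode>.
        Console.WriteLine(IsHierarchyOf(new CmsNode()));            // True

        // T inferred as OtherNode: OtherNode does not extend HierarchyNode<OtherNode>.
        Console.WriteLine(IsHierarchyOf(new OtherNode()));          // False

        // T given explicitly as CmsNode: the check passes again.
        Console.WriteLine(IsHierarchyOf<CmsNode>(new OtherNode())); // True
    }
}
```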

Yes, this is indeed a variance-related deficiency of C#. It also arises in
similar cases with some BCL types (IEquatable<T> and IComparable<T>, to name
some). Unfortunately, what they propose for C# 4.0 and VB10 won't solve this
problem - I've discussed it elsewhere:

http://msmvps.com/blogs/bill/archive/2008/08/11/generic-variance-part1-do-you-really-need-it.aspx
 
