C# Language Specification - Enumeration

Guest

Sure... IEnumerable was inconvenient: it required a separate class to service
the enumeration (IEnumerator) and multiple operations: Current, MoveNext, and
Reset. (I'll warp the definition of "operation" for a second, if you don't
mind.)

However, it existed within intuitive language semantics, whereas the new
"yield" keyword, while highly convenient, is also one of the grossest
warpings of language concepts to date...

public IEnumerator GetEnumerator() {
    foreach (foobar o in _co) yield return o;
}

So, it essentially redefines how functions are supposed to behave, as well as
how return values interact with return types. It also introduces a new
keyword with a very specialized purpose.

One way to explain "yield" is to say that it boxes your object into an
IEnumerator, saves the execution state of your function, and restores that
state the next time your function is called, even before the keyword is
reached during regular instruction flow, so it establishes the loop's next
startup state. Therefore, it redefines the laws of instruction sequencing,
and is sort of a quantum keyword that exists in multiple spots at once during
the existence of that function.

It would make sense to have IEnumerate be an allowed combination of
IEnumerable and IEnumerator, and to add a GetNext function that can return
the proper generic variable type. Then make "yield" its own return-style
keyword:

(** Proposed Syntax:
public class Woo : IEnumerate
{
    ...
    public object GetNext() {
        if (_collection == null) return null;
        foreach (object obj in _collection) {
            yield obj;
        }
    }
    ...
}
**)

This would still allow the full existing enumeration capability, but would
introduce this new style in a cleaner and more intuitive way.
 
Nicholas Paldino [.NET/C# MVP]

Marshal,

The problem with condensing the two interfaces into one is that you end
up losing the fact that a type can say "I can have my contained items
iterated through, and this is the mechanism by which you do it". That's
different from your interface, which says "I can have my contained items
iterated through, and ^I^ will do it". The IEnumerable/IEnumerator
interfaces are better, IMO, because they allow for better encapsulation of
code, at the least.

The other problem with your proposal is that you can't tell if there are
any more items in your iteration. Yeah, you could check the return value of
GetNext against null, but if you are enumerating value types, then you
can't do this, since they can't be null (unless you use Nullable types, but
those weren't around in 1.0, and you would have to write some ugly case logic
to determine whether it was a value type or not, and then whether it was null).

Using your method would end up looking something like this:

// An object, to check for null.
object o = null;

// The type of the object.
T t;

// Loop. You have to check for null here since your implementation would
// return null if there are no items in the enumeration. The assumption here
// is that at the end of the loop, you would return null.
while ((o = collection.GetNext()) != null)
{
    // Cast to the original object.
    t = (T) o;

    // Do some stuff.
}

This is a little messier than what you have to do now. First, you
need to have the object instance just to check for null, and you have a nasty
assignment and comparison in the while statement. Compare that to how
foreach is expanded by the compiler using the current IEnumerator interface
(and assuming that collection implements IEnumerator):

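// The type t.
T t;

// MoveNext returns true while there are still items.
while (collection.MoveNext())
{
    // Get the value (Current is typed as object on the non-generic interface).
    t = (T) collection.Current;
}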

You have t, and you know if you hit the end, but without that ugliness
in between (the o variable checking for null).
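
For reference, a rough sketch of what foreach turns into over a non-generic
IEnumerable. The names ForeachExpansion, SumWithForeach and SumExpanded are
made up for illustration, and the real compiler expansion handles more cases
than this, so treat it as an approximation rather than the exact generated
code:

```csharp
using System;
using System.Collections;

public static class ForeachExpansion
{
    // What you write.
    public static int SumWithForeach(IEnumerable items)
    {
        int sum = 0;
        foreach (int i in items)
        {
            sum += i;
        }
        return sum;
    }

    // Roughly what the compiler generates for the loop above (simplified).
    public static int SumExpanded(IEnumerable items)
    {
        int sum = 0;
        IEnumerator e = items.GetEnumerator();
        try
        {
            // MoveNext both advances and reports whether an element exists.
            while (e.MoveNext())
            {
                int i = (int) e.Current; // Current is typed as object
                sum += i;
            }
        }
        finally
        {
            // Dispose the enumerator if it happens to be disposable.
            IDisposable d = e as IDisposable;
            if (d != null) d.Dispose();
        }
        return sum;
    }
}
```

Both methods should give the same result for the same input; the second one
just spells out the GetEnumerator/MoveNext/Current calls by hand.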
 
Guest

> The problem with condensing the two interfaces into one is that you end
> up losing the fact that a type can say "I can have my contained items
> iterated through, and this is the mechanism by which you do it". That's
> different from your interface, which says "I can have my contained items
> iterated through, and ^I^ will do it". The IEnumerable/IEnumerator
> interfaces are better, IMO, because they allow for better encapsulation of
> code, at the least.

My proposal should be equivalent to the current proposals on using
"yield"; it just cleans up the syntax. In both cases, you can still use
IEnumerable and IEnumerator for the reasons you point out, and both proposals
on "yield" are used for the "^I^ will do it" case anyway.

> The other problem with your proposal is that you can't tell if there are
> any more items in your iteration...

Again, it's basically a cleaner way of expressing the existing yield syntax,
which has no problem terminating the enumeration in magic code. I was
imagining it being used in a foreach statement.

If the GetNext function is called directly as in your example, then I would
expect it should throw an exception if the caller tries to read beyond the
end of the array. They should write something like:

(** Proposed Syntax:
collection.Reset();
while(collection.HasValues) { o = collection.GetNext(); ... }
**)

However, they should be as unlikely to do that as they would be to call
GetEnumerator directly in the current model, which uses yield. If they did so
in the current model, it would not appear as intuitive.
 
Nicholas Paldino [.NET/C# MVP]

See inline:
> Again, it's basically a cleaner way of expressing the existing yield syntax,
> which has no problem terminating the enumeration in magic code. I was
> imagining it being used in a foreach statement.

Ok, so it's cleaner, but at what cost? Clean syntax doesn't mean much if
actually implementing it is not clean as well.

> If the GetNext function is called directly as in your example, then I would
> expect it should throw an exception if the caller tries to read beyond the
> end of the array. They should write something like:
>
> (** Proposed Syntax:
> collection.Reset();
> while(collection.HasValues) { o = collection.GetNext(); ... }
> **)

There is a specific reason why the designers of .NET didn't use
something like HasValues and GetNext: a good number of people actually
forget to code the call that moves to the next item in the enumeration,
leading to infinite loops. With the way it is now (with the call to MoveNext
returning whether or not there are elements), you don't have that problem.
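
To make the difference concrete, here is a minimal sketch of the two loop
styles. HasValuesCursor and LoopStyles are hypothetical names invented for
this example (there is no such type in .NET); the cursor only imitates the
HasValues/GetNext shape proposed in this thread:

```csharp
using System;
using System.Collections;

// Hypothetical cursor in the HasValues/GetNext style (not a real .NET type).
public class HasValuesCursor
{
    private readonly object[] _items;
    private int _next;

    public HasValuesCursor(object[] items) { _items = items; }

    // Testing for more items does not advance the cursor.
    public bool HasValues { get { return _next < _items.Length; } }

    // Advancing is a separate call that the caller must remember to make.
    public object GetNext()
    {
        if (!HasValues) throw new InvalidOperationException("Past the end.");
        return _items[_next++];
    }
}

public static class LoopStyles
{
    public static int CountWithHasValues(object[] items)
    {
        HasValuesCursor c = new HasValuesCursor(items);
        int count = 0;
        while (c.HasValues)
        {
            c.GetNext(); // leave this call out and the loop never ends
            count++;
        }
        return count;
    }

    public static int CountWithMoveNext(object[] items)
    {
        IEnumerator e = items.GetEnumerator();
        int count = 0;
        while (e.MoveNext()) // the test itself advances; nothing to forget
        {
            count++;
        }
        return count;
    }
}
```

In the first loop the termination test and the advance are two separate
calls; in the second they are the same call, so an empty loop body still
terminates.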


--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)
 
Helge Jensen

Marshal said:
> Sure... IEnumerable was inconvenient: it required a separate class to service
> the enumeration (IEnumerator) and multiple operations: Current, MoveNext, and
> Reset. (I'll warp the definition of "operation" for a second, if you don't
> mind.)

IEnumerable expresses that enumeration is supported, "I am iterable, you
can obtain that enumeration by invoking GetEnumerator()".

IEnumerator expresses a "pointer" into an enumeration, "This is how far
you are along the enumeration".

IEnumerator has Current/MoveNext(), Java's Iterator has hasNext() and
next(), C++ iterators have "++" and "*". Actually, I don't recall seeing
any iteration protocol that doesn't have separate advance/advance-check
and value-obtain steps, and thus multiple operations.

I would say that whether Reset() should be included in IEnumerator is
open to discussion.

> So, it essentially redefines how functions are supposed to behave, as well as
> how return values interact with return types. It also introduces a new
> keyword with a very specialized purpose.

It lets you define a generator for IEnumerator using function syntax.
Other languages have this too: Python does, and it is essential in Ruby.

Whether this "redefines the concepts of how functions work" or not...
well, you are really just defining a function that returns an
enumerator. There is no "quantum magic" here, only compiler-assisted
closure generation.

Notice that the caller of the function doesn't care if actual "quantum
magic" was used to implement the IEnumerator returned. The implementer
could do it by implementing the closure, MoveNext() and Current by hand
in a class that stores the relevant state and implements IEnumerator.

Function syntax is rather nice for describing enumeration, and
syntactic-sugar support for storing the local variables used in a
helper class and implementing Current/MoveNext() is convenient to me in
that syntax.
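
As a small sketch of that equivalence, here is an iterator method next to a
hand-written enumerator doing roughly the same job. The names Enumerators,
CountTo, CountToByHand and CountToEnumerator are invented for illustration,
and the class the compiler actually generates is more elaborate than this:

```csharp
using System;
using System.Collections;

public static class Enumerators
{
    // Iterator version: the compiler builds the helper class for you.
    public static IEnumerator CountTo(int limit)
    {
        for (int i = 1; i <= limit; i++)
            yield return i;
    }

    // Hand-written version of roughly the same thing.
    public static IEnumerator CountToByHand(int limit)
    {
        return new CountToEnumerator(limit);
    }

    private sealed class CountToEnumerator : IEnumerator
    {
        private readonly int _limit;
        private int _i;           // the loop's local variable, stored as a field
        private object _current;  // the value handed out by the last MoveNext

        public CountToEnumerator(int limit) { _limit = limit; }

        public object Current { get { return _current; } }

        public bool MoveNext()
        {
            if (_i >= _limit) return false; // nothing left to produce
            _i++;
            _current = _i;
            return true;
        }

        // Compiler-generated iterators do not support Reset either.
        public void Reset() { throw new NotSupportedException(); }
    }
}
```

A caller holding either IEnumerator cannot tell which of the two produced it,
which is the point made above.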

> One way to explain "yield" is to say that it boxes your object into an
> IEnumerator, saves the execution state of your function, and restores that
> state the next time your function is called, even before the keyword is
> reached during regular instruction flow, so it establishes the loop's next
> startup state. Therefore, it redefines the laws of instruction sequencing,
> and is sort of a quantum keyword that exists in multiple spots at once during
> the existence of that function.

Another way to explain it would be that it generates an IEnumerator's
MoveNext() and Current members from the closure available in the
function. This is a well-known technique from functional programming: a
higher-order function.

> It would make sense to have IEnumerate be an allowed combination of
> IEnumerable and IEnumerator, and to add a GetNext function that can return
> the proper generic variable type. Then make "yield" its own return-style
> keyword:

I don't see the benefit of combining IEnumerable and IEnumerator; see above.

The problem with combining the two concepts is manifest in your example
code: where is _collection coming from? (Using the interface prevents you
from having a protocol for accepting it at construction time.)

> (** Proposed Syntax:
> public class Woo : IEnumerate
> {
>     ...
>     public object GetNext() {
>         if (_collection == null) return null;
>         foreach (object obj in _collection) {
>             yield obj;
>         }
>     }
>     ...
> }
> **)


I don't really understand how this is substantially different from or
better than:

public static IEnumerator Enumerate(ICollection c) {
    foreach (object o in c)
        yield return o;
}

Which fits into the current enumeration-idiom in C#.

The choice of "yield return" vs. just "yield" is open to discussion, but
I suspect "yield return" was chosen because no program containing that
sequence of tokens would be valid in previous C# revisions.

> This would still allow the full existing enumeration capability, but would
> introduce this new style in a cleaner and more intuitive way.

Perhaps you are finding the IEnumerable/IEnumerator design
non-intuitive, and perhaps transferring that onto the "yield return" construct?

I find that the IEnumerable/IEnumerator design is spot-on, and I haven't
really seen any language that doesn't make that distinction in one
form or another.

Perhaps you need just a little more assimilation before you accept the
design ;) (notice smiley.... I'm not being hostile here)
 
