IEnumerable is strange

Tin Gherdanarra

Dear mpdls,

here is a simple example of an IEnumerable that
generates integers:

It works, but I have only a vague idea of what's
going on. I understand that /yield/ wraps the
humble integer that comes from

counter++

into an IEnumerator<int>.

I don't understand why I have to implement TWO
GetEnumerator()-methods that both do the same and
only have different return types. Neither do I
understand which one is picked by /foreach/.

If I don't implement both GetEnumerator()s,
I get errors of omission.

IEnumerator is something like a /Generator/, right?
By the way, where is comp.lang.csharp?

Thanks
Tin



using System;
using System.Collections.Generic;
using System.Text;

namespace Riba
{
    public class NumberEnumerable : IEnumerable<int>
    {
        private int counter = 0;

        public IEnumerator<int> Core()
        {
            for (; ; )
            {
                yield return counter++;
            }
        }

        System.Collections.IEnumerator
            System.Collections.IEnumerable.GetEnumerator()
        {
            return Core();
        }

        public IEnumerator<int> GetEnumerator()
        {
            return Core();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            foreach (int i in new NumberEnumerable())
            {
                System.Console.WriteLine("{0}", i);
            }
        }
    }
}
 
Nicholas Paldino [.NET/C# MVP]

Tin,

See inline:


Tin Gherdanarra said:
Dear mpdls,

here is a simple example of an IEnumerable that
generates integers:

It works, but I have only a vague idea of what's
going on. I understand that /yield/ wraps the
humble integer that comes from

counter++

into an IEnumerator<int>.

I don't understand why I have to implement TWO
GetEnumerator()-methods that both do the same and
only have different return types. Neither do I
understand which one is picked by /foreach/.

I agree with you on the frustration of having to implement two
GetEnumerator methods. The reason is that IEnumerable<T> derives from
IEnumerable. Both interfaces declare a parameterless GetEnumerator method,
and the two differ only in their return type. C# cannot overload on return
type alone, so you end up with the strange implementation that requires
one explicit/one implicit, or two explicit ones.

Tin Gherdanarra said:
If I don't implement both GetEnumerator()s,
I get errors of omission.

IEnumerator is something like a /Generator/, right?
By the way, where is comp.lang.csharp?

I don't believe there is one.


--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)

Barry Kelly

Tin Gherdanarra said:
It works, but I have only a vague idea of what's
going on. I understand that /yield/ wraps the
humble integer that comes from

counter++

into an IEnumerator<int>.

I don't understand why I have to implement TWO
GetEnumerator()-methods that both do the same and
only have different return types.

It's for compatibility reasons, but it's also useful because covariance
of generic types isn't supported by the .NET framework. For example, if
you want to write a method which takes in an IEnumerable but don't want
or need the method to be generic, then you'd probably like to write
IEnumerable<object> and have IEnumerable<int> convert to that type. That
doesn't work in C#. Instead, we've got the old IEnumerable.

It's easy to implement the IEnumerator version - simply return the
IEnumerator<T> version.

Tin Gherdanarra said:
Neither do I
understand which one is picked by /foreach/.

If the type of the expression to the right of the 'in' statically(*)
supports IEnumerable<T>, and the iteration variable of the foreach is
assignment compatible with this type T, then the foreach will use the
statically typed version. If the type statically supports IEnumerable<T>
and the iteration variable is *not* assignment compatible, then an error
will occur. Otherwise, the type must statically support IEnumerable, and
a runtime cast will be generated for each iteration.
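
The two cases above can be sketched as follows. This is a minimal illustration, not a description of the compiler's internals; the names ForeachBinding, SumTyped and SumUntyped are hypothetical. Note that foreach will also accept any type that exposes a suitable public GetEnumerator() method, which this sketch does not cover.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ForeachBinding
{
    // Compile-time type is IEnumerable<int>: foreach uses IEnumerator<int>,
    // so Current is already an int and no cast is needed.
    public static int SumTyped(IEnumerable<int> source)
    {
        int total = 0;
        foreach (int x in source)
            total += x;
        return total;
    }

    // Compile-time type is the non-generic IEnumerable: Current is object,
    // so a cast to int is generated for every iteration.
    public static int SumUntyped(IEnumerable source)
    {
        int total = 0;
        foreach (int x in source)
            total += x;
        return total;
    }

    public static void Main()
    {
        List<int> numbers = new List<int>();
        numbers.Add(1);
        numbers.Add(2);
        numbers.Add(3);

        Console.WriteLine(SumTyped(numbers));    // prints 6
        Console.WriteLine(SumUntyped(numbers));  // prints 6, via run-time casts

        ArrayList mixed = new ArrayList();
        mixed.Add("not a number");
        try
        {
            SumUntyped(mixed);
        }
        catch (InvalidCastException)
        {
            // The per-iteration cast fails only when the loop runs.
            Console.WriteLine("cast failed at run time");
        }
    }
}
```

The same loop body compiles in both methods; only the compile-time type of the expression after 'in' decides which enumerator interface is used.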

(*) Statically in this case refers to the *compile-time* type of the
expression to the right of the 'in' in the 'foreach' statement.

Tin Gherdanarra said:
If I don't implement both GetEnumerator()s,
I get errors of omission.

IEnumerator is something like a /Generator/, right?

It's an implementation of CLU's iterator feature, but generator is
another name for it.

Tin Gherdanarra said:
using System;
using System.Collections.Generic;
using System.Text;

namespace Riba
{
public class NumberEnumerable : IEnumerable<int>
{
private int counter = 0;
public IEnumerator<int> Core()
{

for (; ; )
{
yield return counter++;
}
}

Note that your counter variable can be local; it doesn't need to be a
field.

System.Collections.IEnumerator
System.Collections.IEnumerable.GetEnumerator()
{
return Core();

You don't need to factor out the Core() method at all. You can simply return
GetEnumerator() here and it will return the version returning
IEnumerator<T>, since that one isn't an explicit interface implementation.

}

public IEnumerator<int> GetEnumerator()
{
return Core();

}
}

class Program
{
static void Main(string[] args)
{
foreach (int i in new NumberEnumerable())
{
System.Console.WriteLine("{0}", i);
}
}
}
}
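
Putting the inline remarks above together (a local counter, no Core() method, and the explicit implementation forwarding to the public one), the class might look like the sketch below. The early break in Main is an addition for the demo, since the sequence never ends on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class NumberEnumerable : IEnumerable<int>
{
    // Public, strongly typed version; this is the one foreach picks
    // when the compile-time type is NumberEnumerable.
    public IEnumerator<int> GetEnumerator()
    {
        int counter = 0;            // local: every enumerator starts at 0
        for (;;)
            yield return counter++;
    }

    // Explicit implementation for the non-generic interface. The call
    // below binds to the public method above, because an explicit
    // implementation cannot be called by its simple name.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Program
{
    public static void Main()
    {
        foreach (int i in new NumberEnumerable())
        {
            if (i >= 5)
                break;              // demo only: stop the infinite sequence
            Console.WriteLine(i);   // prints 0 through 4
        }
    }
}
```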

Since you're only using the class NumberEnumerable() to implement an
enumerator, you don't need to define a separate class at all. Iterators
can also return IEnumerable<T>:

---8<---
using System;
using System.Collections.Generic;

class Program
{
    static IEnumerable<int> NumberEnumerable()
    {
        int i = 0;
        for (;;)
            yield return i++;
    }

    static void Main(string[] args)
    {
        foreach (int i in NumberEnumerable())
        {
            System.Console.WriteLine("{0}", i);
        }
    }
}
--->8---

-- Barry
 
Tin Gherdanarra

Thanks for the elaborate reply, but maybe IEnumerable
is the wrong thing here anyway. You reply:
It's an implementation of CLU's iterator feature, but generator is
another name for it.

A big advantage of generators is that you can recycle
them and use them in the fashion of command-line pipes,
like this:

RandomNumberGenerator | FilterPrimeNumbersGenerator | MultiplyBy4...

I assumed I can do the same with IEnumerable, but it
seems that it is a lot more tricky than I thought.

The key is that I (wrongly) guessed that /yield/ is just a mechanism
for marking the spot where a function returns this time
and will continue next time. /foreach/ simply calls the
function embodying the /yield/. This can't be true, I guess.
Here is an illustration:

int foo()
{
int counter = 0;
yield counter++;
}

Multiple clients can call foo, and the
spot where to continue has to be remembered for each
new caller. This means that the first call to /foo/
somehow instantiates a new environment for that
particular foo(), complete with its own version
of /counter/ (right?)
In other words, each use of /foreach/ causes
foo() to keep its state information for each
client (right?)

What's more, /foreach/ does not seem to merely /call/
foo(), it does something else, because NOT using
/foreach/ but calling foo() explicitly resets
the counter for every call. Here is an example
for two generators/iterators. I want to pipe the
output of the number-generator into a generator
that packs them up into arrays of three numbers
each.

0 1 2 3 4 5 6 7 8 ... from NumberEnumerable

is fed into /ThreeNumberEnumerable/, which
spits out

{0 1 2} {3 4 5} {6 7 8} ...


Here is my naive implementation that does not
work:


using System;
using System.Collections.Generic;

namespace TestEnumeration
{
    public class NumberEnumerable
    {
        // So far so good, spitting numbers
        public IEnumerator<int> GetEnumerator()
        {
            int counter = 0;
            for (; ; )
            {
                yield return counter++;
            }
        }
    }

    // The packer...
    public class ThreeNumberEnumerable
    {
        int[] three = new int[3];
        NumberEnumerable nen = new NumberEnumerable();

        public IEnumerator<int[]> GetEnumerator()
        {
            for (;;)
            {
                for (int i = 0; i < 3; i++)
                {
                    // Let's call GetEnumerator!
                    // Let's get numbers!
                    three[i] = nen.GetEnumerator().Current;
                    nen.GetEnumerator().MoveNext();
                }

                // Return the next 3-pack
                yield return three;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            ThreeNumberEnumerable lb = new ThreeNumberEnumerable();

            // DISAPPOINTMENT!
            // We get packs of {0 0 0} -- the counter
            // is reset each time

            foreach (int[] i3 in lb)
            {
                Console.WriteLine("{0} {1} {2}", i3[0], i3[1], i3[2]);
            }
        }
    }
}

All this means I can't CALL nen.GetEnumerator(), I have
to do something else, but what? Am I supposed to do it
this way in the first place?

Kind regards
Tin
 
Barry Kelly

Tin Gherdanarra said:
A big advantage of generators is that you can recycle
them and use them in the fashion of command-line pipes,
like this:

RandomNumberGenerator | FilterPrimeNumbersGenerator | MultiplyBy4...

I assumed I can do the same with IEnumerable, but it
seems that it is a lot more tricky than I thought.

You can do this easily. It's the basis of LINQ in C# 3.0. It's trivial:

---8<---
IEnumerable<T> Filter<T>(IEnumerable<T> source, Predicate<T> pred)
{
    foreach (T item in source)
        if (pred(item))
            yield return item;
}
--->8---

The LINQ extension method "Where" (System.Query.Sequence.Where()) looks
like the above.
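
To make the pipe analogy concrete, here is a sketch of a chain in the spirit of "numbers | keep primes | multiply by 4". It assumes a deterministic counter in place of the random source so the output is predictable; Map, TakeFirst, IsPrime and TimesFour are hypothetical helper names, not library methods.

```csharp
using System;
using System.Collections.Generic;

public static class Pipeline
{
    // Infinite source: 0, 1, 2, ...
    public static IEnumerable<int> Count()
    {
        int i = 0;
        for (;;)
            yield return i++;
    }

    // Passes on only the items that satisfy the predicate.
    public static IEnumerable<T> Filter<T>(IEnumerable<T> source, Predicate<T> pred)
    {
        foreach (T item in source)
            if (pred(item))
                yield return item;
    }

    // Transforms each item as it flows through.
    public static IEnumerable<TOut> Map<TIn, TOut>(IEnumerable<TIn> source,
                                                   Converter<TIn, TOut> convert)
    {
        foreach (TIn item in source)
            yield return convert(item);
    }

    // Stops pulling from the source after 'count' items.
    public static IEnumerable<T> TakeFirst<T>(IEnumerable<T> source, int count)
    {
        if (count <= 0)
            yield break;
        foreach (T item in source)
        {
            yield return item;
            if (--count == 0)
                yield break;
        }
    }

    public static bool IsPrime(int n)
    {
        if (n < 2)
            return false;
        for (int d = 2; d * d <= n; d++)
            if (n % d == 0)
                return false;
        return true;
    }

    public static int TimesFour(int n)
    {
        return n * 4;
    }

    public static void Main()
    {
        IEnumerable<int> primes = Filter<int>(Count(), IsPrime);
        IEnumerable<int> scaled = Map<int, int>(primes, TimesFour);

        // Nothing has run yet; items are pulled one at a time by foreach.
        foreach (int n in TakeFirst(scaled, 5))
            Console.WriteLine(n);   // prints 8, 12, 20, 28, 44
    }
}
```

Each stage accepts and returns IEnumerable<T>, so stages can be recombined freely, and the infinite source is safe because every stage is evaluated lazily.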

Tin Gherdanarra said:
The key is that I (wrongly) guessed that /yield/ is just a mechanism
for marking the spot where a function returns this time
and will continue next time.

You were right when you guessed. That's exactly what 'yield return' is:
the point where execution will resume when MoveNext() is called on the
enumerator.

Tin Gherdanarra said:
/foreach/ simply calls the
function embodying the /yield/. This can't be true, I guess.
Here is an illustration:

int foo()
{
int counter = 0;
yield counter++;
}

Multiple clients can call foo, and the
spot where to continue has to be remembered for each
new caller. This means that the first call to /foo/
somehow instantiates a new environment for that
particular foo(), complete with its own version
of /counter/ (right?)

I think you need to download .NET Reflector and investigate a class
which implements an iterator, and similarly investigate how foreach is
implemented. It would take too long to explain in full here.

Tin Gherdanarra said:
In other words, each use of /foreach/ causes
foo() to keep its state information for each
client (right?)

When foreach is first entered, GetEnumerator is called on the enumerable
object and a *new* object implementing IEnumerator[<T>] is returned.
MoveNext() is called once per iteration, and if it ever returns false,
the loop is exited. The value of the Current property is used to
initialize the iteration variable.
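
Written out by hand, the steps just described look roughly like SumByHand below. This is a sketch of the general shape rather than exact compiler output; the method names are hypothetical, and the Dispose call in the finally block is a detail not mentioned above (foreach disposes the enumerator when it is disposable, which IEnumerator<T> always is).

```csharp
using System;
using System.Collections.Generic;

public static class ForeachExpansion
{
    // A small finite iterator to loop over.
    public static IEnumerable<int> UpTo(int limit)
    {
        for (int i = 0; i < limit; i++)
            yield return i;
    }

    public static int SumWithForeach(IEnumerable<int> source)
    {
        int total = 0;
        foreach (int item in source)
            total += item;
        return total;
    }

    // Approximately the same loop without foreach.
    public static int SumByHand(IEnumerable<int> source)
    {
        int total = 0;
        IEnumerator<int> e = source.GetEnumerator();  // a new enumerator per loop
        try
        {
            while (e.MoveNext())        // false ends the loop
            {
                int item = e.Current;   // initializes the iteration variable
                total += item;
            }
        }
        finally
        {
            e.Dispose();
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithForeach(UpTo(5)));  // prints 10
        Console.WriteLine(SumByHand(UpTo(5)));       // prints 10
    }
}
```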

The function implementing the iterator is technically never entered.
It's rewritten into a completely different method which lives in a
different class, automatically created by the compiler. Check it out
with .NET Reflector.

One fundamental piece that you seem to be missing is that the local
variables in the iterator definition do *not* lose their values when the
function returns. In particular, for iterators returning IEnumerable
rather than IEnumerator, you should use local variables for iteration
storage, not fields.
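
As a rough mental model of why the locals survive, here is a hand-written stand-in for what the compiler does with an iterator such as "int i = 0; for (;;) yield return i++;". It is not Reflector output and the real generated class differs in detail; CountEnumerator and CountEnumerable are hypothetical names.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Each enumerator is its own object, so the former local 'i'
// lives in a field of that object between MoveNext() calls.
public class CountEnumerator : IEnumerator<int>
{
    private int i;          // the iterator's "local" variable
    private int current;    // value handed out through Current
    private bool started;   // crude stand-in for the resume point

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        if (!started)
        {
            i = 0;          // runs once: the code before the loop
            started = true;
        }
        current = i++;      // the 'yield return i++' step
        return true;        // this sequence never ends
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public class CountEnumerable : IEnumerable<int>
{
    // Every call hands out a fresh enumerator with fresh state.
    public IEnumerator<int> GetEnumerator() { return new CountEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class StateDemo
{
    public static void Main()
    {
        CountEnumerable numbers = new CountEnumerable();
        IEnumerator<int> a = numbers.GetEnumerator();
        IEnumerator<int> b = numbers.GetEnumerator();

        a.MoveNext();
        a.MoveNext();       // a has advanced twice
        b.MoveNext();       // b has advanced once

        Console.WriteLine("{0} {1}", a.Current, b.Current);  // prints "1 0"
    }
}
```

Two clients enumerating the same CountEnumerable do not disturb each other, because the position is stored per enumerator rather than in the enumerable.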

Tin Gherdanarra said:
I want to pipe the
output of the number-generator into a generator
that packs them up into arrays of three numbers
each.

Here's one solution to that problem:

---8<---
using System;
using System.Collections.Generic;

class App
{
    static IEnumerable<int> Count()
    {
        int i = 0;
        for (;;)
            yield return i++;
    }

    static IEnumerable<T[]> SplitIntoGroups<T>(IEnumerable<T> source,
        int count)
    {
        List<T> result = new List<T>();
        foreach (T item in source)
        {
            result.Add(item);
            if (result.Count == count)
            {
                yield return result.ToArray();
                result.Clear();
            }
        }
    }

    static IEnumerable<T> TakeN<T>(IEnumerable<T> source, int count)
    {
        foreach (T item in source)
        {
            if (count <= 0)
                yield break;
            --count;
            yield return item;
        }
    }

    static void Main()
    {
        foreach (int[] item in SplitIntoGroups(TakeN(Count(), 100), 3))
            Console.WriteLine("({0}, {1}, {2})",
                item[0], item[1], item[2]);
    }
}
--->8---

-- Barry
 
Barry Kelly

Tin Gherdanarra said:
Here is my naive implementation that does not
work:

I thought I'd go through it and point out each place where there's
evidence of a faulty assumption.

namespace TestEnumeration
{
public class NumberEnumerable
{

// So far so good, spitting numbers
public IEnumerator<int> GetEnumerator()

Implementing an iterator which returns IEnumerator is useful when you
want to iterate over an existing collection contained inside the class.
It's handy when you're creating your own collections.

However, when you want to get generator-like chaining behaviour, you
should accept and return IEnumerable rather than IEnumerator.

{
int counter = 0;
for (; ; )
{
yield return counter++;
}
}
}

// The packer...
public class ThreeNumberEnumerable
{
int[] three = new int[3];
NumberEnumerable nen = new NumberEnumerable();

You don't want to keep these fields, because they'll be shared across
every enumerator, causing separate enumerations to interfere with each
other.
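
A small sketch of that kind of interference, using a counter rather than the array for brevity. FieldCounter, LocalCounter and FirstThree are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public class FieldCounter
{
    private int counter = 0;    // shared by every enumeration of this instance

    public IEnumerable<int> Numbers()
    {
        for (;;)
            yield return counter++;
    }
}

public class LocalCounter
{
    public IEnumerable<int> Numbers()
    {
        int counter = 0;        // each enumerator gets its own copy
        for (;;)
            yield return counter++;
    }
}

public static class SharedStateDemo
{
    // Reads the first three items and formats them on one line.
    public static string FirstThree(IEnumerable<int> source)
    {
        List<string> parts = new List<string>();
        foreach (int n in source)
        {
            parts.Add(n.ToString());
            if (parts.Count == 3)
                break;
        }
        return string.Join(" ", parts.ToArray());
    }

    public static void Main()
    {
        FieldCounter f = new FieldCounter();
        Console.WriteLine(FirstThree(f.Numbers()));  // prints "0 1 2"
        Console.WriteLine(FirstThree(f.Numbers()));  // prints "3 4 5"

        LocalCounter l = new LocalCounter();
        Console.WriteLine(FirstThree(l.Numbers()));  // prints "0 1 2"
        Console.WriteLine(FirstThree(l.Numbers()));  // prints "0 1 2"
    }
}
```

With the field, the second loop picks up where the first one stopped; with the local, every loop starts over.
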
public IEnumerator<int[]> GetEnumerator()

Ditto for the IEnumerator vs IEnumerable as above.

{
for(;;)
{
for(int i = 0; i < 3; i++)
{
// Let's call GetEnumerator!
// Let's get numbers!
three[i] = nen.GetEnumerator().Current;


Every time you call GetEnumerator, it starts again from the beginning.
Calling GetEnumerator() creates a new object. When you call MoveNext()
on a freshly created enumerator, it enters the start of
NumberEnumerable.GetEnumerator() method above. The reason you keep
getting 0 is that you keep creating a new enumerator every time around
the loop.

nen.GetEnumerator().MoveNext();
}

// Return the next 3-pack
yield return three;
}
}
}

class Program
{
static void Main(string[] args)
{
ThreeNumberEnumerable lb = new ThreeNumberEnumerable();

// DISAPPOINTMENT!
// We get packs of {0 0 0} -- the counter
// is reset each time

foreach (int[] i3 in lb)
{
Console.WriteLine("{0} {1} {2}", i3[0], i3[1], i3[2]);
}
}
}
}

All this means I can't CALL nen.GetEnumerator(), I have
to do something else, but what? Am I supposed to do it
this way in the first place?

You could have called GetEnumerator() once at the start of the function
(getting an IEnumerator), and worked with that one IEnumerator throughout
the loop. It isn't as nice to deal with as foreach, though. Usually it's
best to structure your loops so that foreach is applied to the
enumerable, if you possibly can.
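
For completeness, here is a sketch of that first approach: one enumerator obtained up front and reused for every pack. The names Packer and PackThrees are hypothetical, the source is taken as an IEnumerable parameter rather than a field, and a new array is allocated per pack so that callers never share a buffer.

```csharp
using System;
using System.Collections.Generic;

public static class Packer
{
    public static IEnumerable<int> Count()
    {
        int i = 0;
        for (;;)
            yield return i++;
    }

    public static IEnumerable<int[]> PackThrees(IEnumerable<int> source)
    {
        // Get the enumerator ONCE and keep advancing the same object.
        using (IEnumerator<int> e = source.GetEnumerator())
        {
            for (;;)
            {
                int[] three = new int[3];
                for (int i = 0; i < 3; i++)
                {
                    if (!e.MoveNext())
                        yield break;        // source ran dry; drop a partial pack
                    three[i] = e.Current;   // read Current only after MoveNext()
                }
                yield return three;
            }
        }
    }

    public static void Main()
    {
        int shown = 0;
        foreach (int[] pack in PackThrees(Count()))
        {
            Console.WriteLine("{0} {1} {2}", pack[0], pack[1], pack[2]);
            if (++shown == 3)
                break;                      // demo only: the source is infinite
        }
        // prints "0 1 2", "3 4 5", "6 7 8"
    }
}
```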

-- Barry
 
