foreach enhancement

Daniel O'Connell [C# MVP]

Michael C said:
If other types implement these operators at some point, wouldn't it be
incumbent on the programmer to overload the operators if they wanted
different behavior?

Well, consider this:
[1...100] is an IList<int>.
[1...100] + [1...22] is effectively IList<int> + IList<int>. As such, a +
operator that operates on ILists is needed. However, if you provide a class
that implements IList<T> and overloads operator+, what happens? I'm more
concerned about the general meaning than anything else. A pseudo-operator
that adds lists adds all lists; I don't think it's particularly clean to
limit it to one kind of list. Thus, you have potential conflicts.
The other issue, -[50...75] is confusing. Is it going to remove *every*
instance of 50...75 or just the first one, or just a segment running from 50
to 75? With a comprehension exactly what is going to happen is spelled out,
there is no room for confusion.

Confusion is at the core of this project. Not a lot of the desired end
results have been agreed upon, nor the underlying problem defined. One
person defines the problem as simple x..y ranges, in which overlapping
segments make no sense; one sees the usefulness of multiple ranges with
endpoints that are non-contiguous, another thinks overlap may be useful
somewhere down the road to somebody; another person sees ranges as being
part of a larger list type which may contain overlapping ranges or
non-contiguous ranges. So let's take a step back.

Confusion is often the nature of ideas discussed on public forums. There are
a lot of ideas flying around, granted, and I'm trying to cull through them
and get sane behaviour. I don't think that too many ideas or too many
opinions kill off a potential feature's value so much as make it difficult
to discern the core idea and what needs to be done. Given enough time there
is a good chance a general consensus will be reached.

If not, no offense intended to anyone, but as I am the one implementing it, I
am taking it upon myself to try to filter that and create the best possible
solution, even if one idea or another has to go by the wayside because of
it. The hard part is dumping my own ideas on the issue, however ;).

Everyone's goals are a touch different. I'm pushing for a very abstract
concept where each individual syntax exists independently but can be
combined. I can't say for certain what exactly everyone else wants.
If we are looking at [1..100] as a list, which may contain multiple ranges,
then let's start with the + operator. If the list allows multiple ranges,
the + operator would produce the following result:

[1..100] + [25..75] = [1..100, 25..75]

Therefore the += operator would produce similar results:

x = [1..100];
x += [25..75]; // x = [1..100, 25..75]

I suggested the * operator because a programmer might want to produce a
single range with no overlap in some situations (i.e., looping where overlap
makes no sense). The | operator might make sense here also:

[1..100] * [25..150] = [1..150]

This leads to the shorthand *= also. As for the minus operator, that is
dependent partially on the definition of the actual problem from above.
Regardless, my thoughts are that most programmers expect their minus
operator to work inversely to the plus operator. So the - operator, as
inverse of the + operator, would provide the following:

x = [1..100, 25..75];
x += [25..75]; // x = [1..100, 25..75, 25..75]
x -= [25..75]; // x = [1..100, 25..75] - removes one range, as + adds one range

Perhaps an operator could be agreed upon that also removes all of a certain
type of range (I just put in the ! because I'm running out of keyboard
chars -- think of it as a placeholder):

x = [1..100, 25..75];
x != [25..75]; // removes all subranges 25..75 from the list

These are just ideas, but the problem has to be clearly defined first.
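
To make the competing readings above concrete, here is a minimal C# sketch. It assumes a range list is nothing more than a list of inclusive (Lo, Hi) pairs. The helper names Plus, Star, Minus and MinusAll are invented for illustration and stand in for +, *, - and the ! placeholder; none of this is a settled design.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// '+' reading: append the range and keep any overlap.
List<(int Lo, int Hi)> Plus(List<(int Lo, int Hi)> list, (int Lo, int Hi) range)
{
    return new List<(int Lo, int Hi)>(list) { range };
}

// '*' reading: add the range, then fuse every pair of overlapping ranges.
List<(int Lo, int Hi)> Star(List<(int Lo, int Hi)> list, (int Lo, int Hi) range)
{
    var fused = new List<(int Lo, int Hi)>();
    foreach (var next in Plus(list, range).OrderBy(r => r.Lo))
    {
        int last = fused.Count - 1;
        if (last >= 0 && next.Lo <= fused[last].Hi)
            fused[last] = (fused[last].Lo, Math.Max(fused[last].Hi, next.Hi));
        else
            fused.Add(next);
    }
    return fused;
}

// '-' reading: remove one matching range, mirroring what '+' added.
List<(int Lo, int Hi)> Minus(List<(int Lo, int Hi)> list, (int Lo, int Hi) range)
{
    var copy = new List<(int Lo, int Hi)>(list);
    copy.Remove(range);   // List<T>.Remove drops only the first match
    return copy;
}

// '!' placeholder reading: remove every matching range.
List<(int Lo, int Hi)> MinusAll(List<(int Lo, int Hi)> list, (int Lo, int Hi) range)
{
    return list.Where(r => r != range).ToList();
}

var start = new List<(int Lo, int Hi)> { (1, 100) };
var appended = Plus(start, (25, 75));             // [1..100, 25..75]
var unioned = Star(start, (25, 150));             // [1..150]
var doubled = Plus(appended, (25, 75));           // [1..100, 25..75, 25..75]
var afterMinus = Minus(doubled, (25, 75));        // [1..100, 25..75]
var afterMinusAll = MinusAll(doubled, (25, 75));  // [1..100]

Console.WriteLine(string.Join(", ", unioned.Select(r => $"{r.Lo}..{r.Hi}")));  // 1..150
```

Even in this toy form, each operator needs its own rule for overlap and for how many matches to remove, which is the part that has to be agreed on first.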

I agree that there is a need to be able to express these ideas. I see you
prefer operators while I prefer the comprehension style. I am concerned
about operator overloading issues as above, but my primary goal with
comprehensions is that they are more flexible (and probably less performant,
bringing up your favorite adage again). A comprehension isn't limited to
joining, removing, and intersecting, but can be used to perform complex
operations on list contents, including transformations and lookups:
[yield GetValue(value) for 1...1000] for example;

I think the biggest problem is there *isn't* a problem, per se. Instead
there are places where several people can see room for improvement or where
they feel features would fit well (me and my list comprehension fetish, for
example). This leads to the issue of things that are nice to have, that may
well be noticeable improvements to the language, and yet you still can't find
a valid argument for supporting them. The best I can say is that I am
interested in exploring a clearer way to deal with ranges and lists.

As it stands, with the inclusion of generics I can only think of a single
real problem with the language, which is basically the general crappiness of
the casting operator.
Don't fret, just take a step back and clearly define the problem in terms of
the end result. I have a pretty good idea of the end result I would like,

End results aren't the hard part. I have a pretty good grasp of the end
result and as much of a problem as one can muster here. The problem is every
practical way of tackling that result ends up having one showstopper issue or
another. Given the desire to maintain consistency, a lot of rules have to be
considered, and no one solution I've dreamt up yet maintains every rule and
still provides the feature set we are looking at. Many of the problems are
issues regardless of whether you use operators or comprehensions; the problem
comes in with lists themselves, more specifically lists of lists.
but it doesn't necessarily mesh with all the other ideas being thrown
around. With that in mind, how about + to put a list in the list, and | to
put the list's contents in the list w/ overlap and * to copy contents w/o
overlap... In that case:

[1..100] + [25..75] = [1..100, [25..75]]
[1..100] | [25..75] = [1..100, 25..75] // if this operator were implemented with overlap
[1..100] * [25..150] = [1..150] // if this operator were implemented with no overlap

How you would implement the list inside of a list, and what operator - if
any - to remove items from the list, is another story...

Thanks,
Michael C.
 
Michael C

If other types implement these operators at some point, wouldn't it be
incumbent on the programmer to overload the operators if they wanted
different behavior?

Well, consider this:
[1...100] is an IList<int>.
[1...100] + [1...22] is effectively IList<int> + IList<int>. As such, a +
operator that operates on ILists is needed. However, if you provide a class
that implements IList<T> and overloads operator+, what happens? I'm more
concerned about the general meaning than anything else. A pseudo-operator
that adds lists adds all lists; I don't think it's particularly clean to
limit it to one kind of list. Thus, you have potential conflicts.

Regardless of what operator you use, or how you implement, you have to
provide some base functionality for managing lists. If we take a step back
and look at <int> + <int> and the overloaded operator + in <string> +
<string>, we see that in comparison *+* should behave the same
whether we have a list of ints, strings, real numbers, enumerated types, or
anything we can compose a list of. If you think of the list as a basic
type, it becomes the programmer's responsibility to ensure proper
functionality if he decides to override the default operators. You can warn
programmers about the dangers, but you can't possibly account for every
potential (mis)-use of them.
Everyone's goals are a touch different. I'm pushing for a very abstract
concept where each individual syntax exists independently but can be
combined. I can't say for certain what exactly everyone else wants.

That's the starting point - deciding exactly what you are trying to achieve.
I agree that there is a need to be able to express these ideas. I see you
prefer operators while I prefer the comprehension style. I am concerned
about operator overloading issues as above, but my primary goal with
comprehensions is that they are more flexible (and probably less performant,
bringing up your favorite adage again). A comprehension isn't limited to
joining, removing, and intersecting, but can be used to perform complex
operations on list contents, including transformations and lookups:
[yield GetValue(value) for 1...1000] for example;

True, I do prefer operators mostly because of their simplicity. But if
we're looking at list comprehensions, why re-invent the wheel from scratch?
Just borrow implementation details from the Python syntax. Of course then
we're talking about Python list objects and methods, which takes us back to
the efficiency and readability when converted to C#.

Thanks,
Michael C.
 
Daniel O'Connell [C# MVP]

Well, consider this:
[1...100] is an IList<int>.
[1...100] + [1...22] is effectively IList<int> + IList<int>. As such, a +
operator that operates on ILists is needed. However, if you provide a class
that implements IList<T> and overloads operator+, what happens? I'm more
concerned about the general meaning than anything else. A pseudo-operator
that adds lists adds all lists; I don't think it's particularly clean to
limit it to one kind of list. Thus, you have potential conflicts.

Regardless of what operator you use, or how you implement, you have to
provide some base functionality for managing lists. If we take a step back
and look at <int> + <int> and the overloaded operator + in <string> +
<string>, we see that in comparison *+* should behave the same
whether we have a list of ints, strings, real numbers, enumerated types, or
anything we can compose a list of. If you think of the list as a basic
type, it becomes the programmer's responsibility to ensure proper
functionality if he decides to override the default operators. You can warn
programmers about the dangers, but you can't possibly account for every
potential (mis)-use of them.


I agree, however as it stands, lists *weren't* basic types at any point in
time. I suppose I could drop any semblance of compatibility with existing
code, but that makes the feature a bit less enticing.
That's the starting point - deciding exactly what you are trying to achieve.

I know what *I* want, it's everyone else that's making life hard.
I agree that there is a need to be able to express these ideas. I see you
prefer operators while I prefer the comprehension style. I am concerned
about operator overloading issues as above, but my primary goal with
comprehensions is that they are more flexible (and probably less performant,
bringing up your favorite adage again). A comprehension isn't limited to
joining, removing, and intersecting, but can be used to perform complex
operations on list contents, including transformations and lookups:
[yield GetValue(value) for 1...1000] for example;

True, I do prefer operators mostly because of their simplicity. But if
we're looking at list comprehensions, why re-invent the wheel from scratch?
Just borrow implementation details from the Python syntax. Of course then
we're talking about Python list objects and methods, which takes us back to
the efficiency and readability when converted to C#.
Thus far I've taken Python's implementation pretty much to heart. I changed
the syntax mainly because Python's syntax exists in a pretty typeless world
and has a few bits that look terrible in C#, and because Python is pretty
typeless (or dynamically typed, or whatever you want to call it), it's hard to
deal with multiple lists with similar syntax. Right now my comprehension
syntax can only deal with *one* list. I wrote a syntax to deal with multiple
lists, but I didn't like it (shown below). Outside of that I've made no
changes.
Python list comprehension is something like:
[x for x in 1...1000 if x%2==0]
while my suggestion is
[yield value for 1...1000 where value%2==0]
or
[1...1000 where value%2==0]
I chose where because it reads better without a yield. I also elected to use
value because everyone is used to value being implicitly typed, whereas
custom implicitly typed variables are vastly against the grain. However, it
does turn value into something more like a keyword than it is currently.

My initial multi-list sequence was:
[yield new Pair(x,y) for x<- 1...1000, y<-2...2000]
but it has that nasty implicit typing again.
One other possibility may be multiple for's. Python apparently supports this
but it seems pretty esoteric.
[yield value for 1...1000 where value%2 == 0, for 2...2000 where value%2 == 1]
perhaps? I need to research this more.
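
For comparison, one existing way to read a two-source comprehension is as a nested loop that yields pairs. The sketch below uses LINQ query syntax, which only arrived in C# 3.0 and so is not the syntax proposed here; it is shown purely as an assumption about what the x/y form might mean. The ranges are shortened so the ordering is easy to follow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two 'from' clauses nest: y runs through its whole range for each x.
var pairs = (from x in Enumerable.Range(1, 3)    // stands in for x <- 1...3
             from y in Enumerable.Range(2, 2)    // stands in for y <- 2...3
             select (x, y)).ToList();

foreach (var pair in pairs)
    Console.WriteLine(pair);
// (1, 2)
// (1, 3)
// (2, 2)
// (2, 3)
// (3, 2)
// (3, 3)
```

Note that both range variables are implicitly typed here, which is exactly the property being objected to above.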

However there is one issue .NET has that Python was able to circumvent. In
Python, strings don't appear to be enumerable objects (or at least the
language can treat them as if they aren't), however in .NET they are. The
biggest problem is that determining *what* a user means when they pass a
string to a list is hard. A string is both a basic type *and* a sequence of
characters. That is a huge wrench in the works right now.
 
Michael C

Daniel O'Connell said:
python list comprehension is something like:
[x for x in 1...1000 if x%2==0]
while my suggestion is
[yield value for 1...1000 where value%2==0]
or
[1...1000 where value%2==0]

Most programmers already understand the for and foreach keywords, why
introduce unfamiliarity into a looping comprehension with 'yield'? How
about

[for value in 1..1000 where value%2 == 0]

or

[foreach value in ['a', 'b', 'c']]

The only 2 particular real problems I see (other than deciding what the
operators and delimiters should look like), are 1) dealing with lists inside
of lists and 2) lists composed of multiple types of values (i.e., [1, 'A',
3.141592, "hello"]). I don't see a problem with lists containing strings,
unless the user is trying to loop over a range of strings, as in
"abc".."xyz" in which case it would be hard to tell what the user wanted to
accomplish. In that instance, I would limit range declarations to simple
scalar types; although strings can be included in, and added to, lists, I
wouldn't allow strings to be used to declare ranges.

Cheers,
Michael C.
 
Michael C

My initial multi-list sequence was:
[yield new Pair(x,y) for x<- 1...1000, y<-2...2000]
but it has that nasty implicit typing again.
one other possibility may be multiple for's. Python apparently supports this
but it seems pretty esoteric.
[yield value for 1...1000 where value%2 == 0, for 2...2000 where value%2 ==
1] perhaps? I need to research this more.

[for x, y in [1..1000, -2..2000]]

Where the first variable in the for list loops the first range and the
second variable loops the inner second range. Again we have to explicitly
type x and y for this to work... Why not use for to loop ranges for scalar
types (as above), and foreach to iterate non-scalars (such as strings):

list = ["red", "green", "blue"];
[foreach color in list]

By explicit definition [for ... in ...] would loop scalars whereas [foreach
... in ...] would iterate a list (scalar or non-scalar). The explicit
typing of the [for] would allow for greater optimization, while the implicit
typing (object?) of [foreach] would allow more flexibility. Of course you
could still use [foreach] to iterate a range of values, but the implicit
typing would mean a performance hit.
 
Daniel O'Connell [C# MVP]

Michael C said:
Daniel O'Connell said:
python list comprehension is something like:
[x for x in 1...1000 if x%2==0]
while my suggestion is
[yield value for 1...1000 where value%2==0]
or
[1...1000 where value%2==0]

Most programmers already understand the for and foreach keywords, why
introduce unfamiliarity into a looping comprehension with 'yield'? How
about

[for value in 1..1000 where value%2 == 0]

or

[foreach value in ['a', 'b', 'c']]

Because you lose the ability to do, say, return 2x value or return a string
based on the value.
[yield GetStringFor(value) for 1...1000].
Python allows an initial expression to do this, in the form
[GetStringFor(x) for x in list]
However, that simply doesn't make much sense in the C# way of things. Plus I
think yield will become much more commonly understood with the spread of C#
2.0. Plus, even though it's not terribly apparent, a list comprehension is
really a very quick, easy to write iterator. Yield works with both, tying
the meaning of yield in quite nicely I think.
This also leaves both yield and where as context-sensitive words, as they are
only valid within a comprehension.
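
To show what "a comprehension is really an iterator" could mean in practice, here is a rough hand-written C# equivalent of [yield GetStringFor(value) for 1...1000]. GetStringFor is only named in the post, so the body given here is a hypothetical placeholder, as is the Comprehension helper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical body for the GetStringFor helper named in the example.
string GetStringFor(int value) => "item " + value;

// An iterator block doing by hand what the comprehension would express inline.
IEnumerable<string> Comprehension(int low, int high)
{
    for (int value = low; value <= high; value++)
        yield return GetStringFor(value);
}

var labels = new List<string>(Comprehension(1, 1000));
Console.WriteLine(labels.Count);   // 1000
Console.WriteLine(labels[0]);      // item 1
```

The comprehension form would save the separate method declaration; the yield keyword would carry the same meaning in both places.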

I also don't think there is any value in using foreach here. Your foreach
example is *identical* to ['a','b','c'] and [['a','b','c']] due to the rule
I explain below. It might be sensible to replace for with foreach in all
examples and drop the for keyword from consideration. Again, however, I am
seriously concerned with implicit typing.
The only 2 particular real problems I see (other than deciding what the
operators and delimiters should look like), are 1) dealing with lists inside
of lists and 2) lists composed of multiple types of values (i.e., [1, 'A',
3.141592, "hello"]). I don't see a problem with lists containing strings,

Dealing with multiple types basically forces the list to be typed object. It
would probably only be really valuable for reference based things however.
Dynamic languages certainly have the upper hand here.
unless the user is trying to loop over a range of strings, as in
"abc".."xyz" in which case it would be hard to tell what the user wanted to
accomplish. In that instance, I would limit range declarations to simple
scalar types; although strings can be included in, and added to, lists, I
wouldn't allow strings to be used to declare ranges.

One of the current rules in my lists spec is that IEnumerable objects are
expanded into the list. This rule allows the expression 1...1000 to return
an IEnumerable value (in most cases), which allows ranges to be used as
generic IEnumerables, both in foreach, as parameters, or as variables. It
also means you can construct a list from another list or other IEnumerable.
However, string happens to implement IEnumerable, so the syntax ["water"]
means literally the same as ['w','a','t','e','r'], which is unwarranted
*AND* unwanted in most, but probably not all, cases. Trying to make that fit
in is a rather difficult proposition. If the type was something *other* than
strings, I wouldn't have a problem introducing override syntax. However I
expect strings to be rather common and requiring special syntax for strings
or using special rules is probably going to bite back rather quickly.
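
The effect of the expansion rule can be reproduced in today's C# with a small helper. BuildList below is a hypothetical stand-in for the proposed [ ... ] literal, written under the assumption that every item implementing IEnumerable is expanded in place.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for [ ... ]: expand anything that is IEnumerable.
List<object> BuildList(params object[] items)
{
    var result = new List<object>();
    foreach (var item in items)
    {
        if (item is IEnumerable sequence)
        {
            foreach (var element in sequence)
                result.Add(element);
        }
        else
        {
            result.Add(item);
        }
    }
    return result;
}

var fromList = BuildList(new List<int> { 1, 2, 3 });   // expanded, as intended
var fromString = BuildList("water");                   // also expanded
var fromChars = BuildList('w', 'a', 't', 'e', 'r');

Console.WriteLine(fromList.Count);                      // 3
Console.WriteLine(fromString.Count);                    // 5
Console.WriteLine(fromString.SequenceEqual(fromChars)); // True
```

Because string implements IEnumerable, the helper cannot tell "water" apart from a sequence of five characters unless string is special-cased, which is the difficulty described above.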

So, now the issue is figuring out how to express ranges legally and usefully
in the type system without compromising something else or figure out how to
handle strings and contained lists.
 
Daniel O'Connell [C# MVP]

Michael C said:
My initial multi-list sequence was:
[yield new Pair(x,y) for x<- 1...1000, y<-2...2000]
but it has that nasty implicit typing again.
one other possibility may be multiple for's. Python apparently supports this
but it seems pretty esoteric.
[yield value for 1...1000 where value%2 == 0, for 2...2000 where value%2 ==
1] perhaps? I need to research this more.

[for x, y in [1..1000, -2..2000]]

What order does that return? All x's then all y's? Or x, then y, then x, then
y and so on? How do you yield pairs? How do you perform joins?
Where the first variable in the for list loops the first range and the
second variable loops the inner second range. Again we have to explicitly
type x and y for this to work... Why not use for to loop ranges for scalar
types (as above), and foreach to iterate non-scalars (such as strings):

list = ["red", "green", "blue"];
[foreach color in list]

By explicit definition [for ... in ...] would loop scalars whereas [foreach
... in ...] would iterate a list (scalar or non-scalar). The explicit
typing of the [for] would allow for greater optimization, while the implicit
typing (object?) of [foreach] would allow more flexibility. Of course you
could still use [foreach] to iterate a range of values, but the implicit
typing would mean a performance hit.

Again, I'm not sure if this really makes much sense. Each of these examples
is basically a useless use of the keywords. The for/foreach syntax doesn't add
anything to the list without some kind of processing.

Beyond that, since the lists are mostly strongly typed, I think you could
achieve optimizations using either for or foreach in the majority of cases,
even though it would mean sacrificing some clarity in the code-to-IL
mapping.
 
Michael C

I also don't think there is any value in using foreach here. Your foreach
example is *identical* to ['a','b','c'] and [['a','b','c']] due to the rule
I explain below. It might be sensible to replace for with foreach in all
examples and drop the for keyword from consideration. Again, however, I am
seriously concerned with implicit typing.

foreach would add value in cases where the range cannot be expanded to a
simple for loop and the items must be iterated over. Examples might be:

[foreach value in ["hello", 10, "how are you?", 'd', 3.141592]]

Obviously in this situation treating the values in the list like a range
makes no sense. You have to iterate over the 5 list items and return their
values one at a time... this comprises your looping in this instance. Also,
in this example implicit typing will cause a performance hit, but is really
the only way to accomplish the objective.
Dealing with multiple types basically forces the list to be typed object. It
would probably only be really valuable for reference based things however.
Dynamic languages certainly have the upper hand here

Well... if we're dealing with single-type lists, implicit typing should not
be an issue at all, as the reference variable should be explicitly typed
anyway. As in

int i;
[yield i for [1..1000]]
One of the current rules in my lists spec is that IEnumerable objects are
expanded into the list. This rule allows the expression 1...1000 to return
an IEnumerable value (in most cases), which allows ranges to be used as
generic IEnumerables, both in foreach, as parameters, or as variables.

Doesn't this mean that you'll later take a hit on optimization though? Like
optimizing loops? Also it will mean a good deal of memory dedicated to the
contents of large lists - [-32767..32767] will be a very large list indeed.
I don't have a particular problem with that, but how does enumerating the
individual items fit into your optimization scheme for looping? Or are you
still planning to do that?
However, string happens to implement IEnumerable, so the syntax ["water"]
means literally the same as ['w','a','t','e','r'], which is unwarranted
*AND* unwanted in most, but probably not all, cases. Trying to make that fit
in is a rather difficult proposition. If the type was something *other* than
strings, I wouldn't have a problem introducing override syntax. However I
expect strings to be rather common and requiring special syntax for strings
or using special rules is probably going to bite back rather quickly.

Expanding the string "water" into 'w','a','t','e','r' in a list is probably
one of the most unnecessary things I've seen. Anyone who codes to that is
probably going to have a rude awakening if such an arcane feature
disappears. I don't see this as really being an issue. If you look at
ArrayLists, they implement IEnumerable, yet you can add strings to them and
retrieve them whole (not one char at a time) during the enumeration of the
ArrayList. I don't see how this violates anything, and I'm not sure how
implementing this same type of functionality in lists would be problematic.
If it does violate some rules of the road, at least Microsoft has set a
precedent for you with arrays, arraylists, and just about every other type
that can store strings and implements IEnumerable.
So, now the issue is figuring out how to express ranges legally and usefully
in the type system without compromising something else or figure out how to
handle strings and contained lists.

From what you've said, it sounds like you already consider ranges to be
enumerations. And as I mentioned before, I would use ranges only where they
make sense - as shorthand for defining portions of the list. What areas
would this make sense in? Ints, Chars, Bytes, etc., etc. Basically any
type that has well-defined bounds and well-defined finite increments.
That's not to say you couldn't add other items to the list, but they would
need to be added on an individual basis. Ranges like ["hello".."cat"] and
[3.141592..2.8] make no sense to define as enumerable, incrementable ranges.
To add them to a list, however, [1..100, "hello", "cat", 3.141592, 2.8]
would make more sense, along with whatever in-between values the programmer
deems necessary. Again, you have to put some of the responsibility back on
the programmer. You can never write a tool that can be all things to all
people; but you can create tools that others can expand on later.

Thanks,
Michael C.
 
Daniel O'Connell [C# MVP]

Michael C said:
I also don't think there is any value in using foreach here. Your foreach
example is *identical* to ['a','b','c'] and [['a','b','c']] due to the rule
I explain below. It might be sensible to replace for with foreach in all
examples and drop the for keyword from consideration. Again, however, I am
seriously concerned with implicit typing.

foreach would add value in cases where the range cannot be expanded to a
simple for loop and the items must be iterated over. Examples might be:

[foreach value in ["hello", 10, "how are you?", 'd', 3.141592]]

Obviously in this situation treating the values in the list like a range
makes no sense. You have to iterate over the 5 list items and return their
values one at a time... this comprises your looping in this instance. Also,
in this example implicit typing will cause a performance hit, but is really
the only way to accomplish the objective.

Again, I see no reason why you can't merge this into one keyword. Two
different compiled representations of one keyword are irrelevant as long as
the *semantic* and effective behaviour is the same.
Dealing with multiple types basically forces the list to be typed object. It
would probably only be really valuable for reference based things however.
Dynamic languages certainly have the upper hand here

Well... if we're dealing with single-type lists, implicit typing should not
be an issue at all, as the reference variable should be explicitly typed
anyway. As in

int i;
[yield i for [1..1000]]

IMHO, I'd like to find a way where the variable exists only *within* the
list, just as for and foreach loops work now.
One of the current rules in my lists spec is that IEnumerable objects are
expanded into the list. This rule allows the expression 1...1000 to return
an IEnumerable value (in most cases), which allows ranges to be used as
generic IEnumerables, both in foreach, as parameters, or as variables.

Doesn't this mean that you'll later take a hit on optimization though? Like
optimizing loops? Also it will mean a good deal of memory dedicated to the
contents of large lists - [-32767..32767] will be a very large list indeed.
I don't have a particular problem with that, but how does enumerating the
individual items fit into your optimization scheme for looping? Or are you
still planning to do that?

This is where the compiler comes in. There are quite a few tricks that can
be pulled off by the compiler without the user being able to tell the
difference.
Consider these loops:

foreach (int i in 1...1000)
{

}

foreach (int i in [1...1000])
{

}

IEnumerable<int> enumer = 1...1000;
foreach (int i in enumer)
{

}

What do you expect to see? The compiler could easily transform the first two
into for loops while using the last as a literal enumeration (because it
can't guarantee that enumer is ranged). All of this is done based on
information the compiler has, specifically that in the first two cases you
are dealing with a generator and a list built off a generator; as such, the
compiler can rather easily examine properties on the expression containing
high, low, and increment and generate the loop. The third, however, doesn't
exist as a range or list expression; it is instead a
LocalVariableExpression (or what have you) which also resolves to
IEnumerable, but which cannot be optimized.

The class in the compiler might be something like (discounting types, which
complicate the compiler a bit)
class RangeExpression : Expression
{
    public int UpperBound;
    public int LowerBound;
    public int Increment;

    // assumes the Expression base class declares a virtual Emit()
    public override void Emit()
    {
        // generate an IEnumerable class *or* use a preexisting one
        // create and load it.
    }
}

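and then the compiler could do
public void CompileForeach(Expression localExpression, Expression enumeratorExpression)
{
    ....

    if (enumeratorExpression is RangeExpression)
    {
        // a range: generate a plain for loop instead of an enumerator
        GenerateFor((RangeExpression)enumeratorExpression);
    }
    else
    {
        // anything else: emit the enumerable and iterate it as usual
        enumeratorExpression.Emit();
    }
    ....
}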

In essence it's entirely possible to optimize and still maintain 1...100 as
an enumerator. That particular magic is unsurprising, consistent, and
beneficial, unlike some other bits.
It also makes this optimization optional, as not supporting it results in
code that is semantically identical.
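
The claim that the two forms are semantically identical can be checked by hand. In the sketch below, RangeSequence is a hypothetical stand-in for whatever enumerable a range expression would produce; the for loop is what an optimizing compiler could emit in its place.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the enumerable behind 1...1000.
IEnumerable<int> RangeSequence(int low, int high, int increment)
{
    for (int current = low; current <= high; current += increment)
        yield return current;
}

// What the programmer writes: enumerate the range.
long viaForeach = 0;
foreach (int i in RangeSequence(1, 1000, 1))
    viaForeach += i;

// What the compiler could generate instead: a plain counted loop.
long viaFor = 0;
for (int i = 1; i <= 1000; i += 1)
    viaFor += i;

Console.WriteLine(viaForeach);             // 500500
Console.WriteLine(viaForeach == viaFor);   // True
```

Since the observable result is the same either way, a compiler that skips the rewrite is still correct, only slower.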
However, string happens to implement IEnumerable, so the syntax ["water"]
means literally the same as ['w','a','t','e','r'], which is unwarranted
*AND* unwanted in most, but probably not all, cases. Trying to make that fit
in is a rather difficult proposition. If the type was something *other* than
strings, I wouldn't have a problem introducing override syntax. However I
expect strings to be rather common and requiring special syntax for strings
or using special rules is probably going to bite back rather quickly.

Expanding the string "water" into 'w','a','t','e','r' in a list is probably
one of the most unnecessary things I've seen. Anyone who codes to that is
probably going to have a rude awakening if such an arcane feature
disappears. I don't see this as really being an issue. If you look at
ArrayLists, they implement IEnumerable, yet you can add strings to them and
retrieve them whole (not one char at a time) during the enumeration of the
ArrayList. I don't see how this violates anything, and I'm not sure how
implementing this same type of functionality in lists would be problematic.
If it does violate some rules of the road, at least Microsoft has set a
precedent for you with arrays, arraylists, and just about every other type
that can store strings and implements IEnumerable.

It's easier to do in ArrayList. ArrayList gives you a nice selection of
methods, like Add, which have a defined purpose. [] however has to have
semantics to deal with both sequences and single objects. Lists and strings,
along with other IEnumerable objects, are unfortunately both, so which
behavior do you apply to any given class?
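
The ArrayList case can be shown directly: the caller states the intent by choosing the method, so no guessing is needed. This is a small sketch using only the existing ArrayList.Add and ArrayList.AddRange methods; the variable names are illustrative.

```csharp
using System;
using System.Collections;

var whole = new ArrayList();
whole.Add("water");                          // one element: the string itself

var expanded = new ArrayList();
expanded.AddRange("water".ToCharArray());    // five elements: 'w','a','t','e','r'

Console.WriteLine(whole.Count);      // 1
Console.WriteLine(expanded.Count);   // 5
```

A single [ ] literal has no equivalent of that choice between Add and AddRange, so it would need a rule, or extra syntax, to decide which one the programmer meant.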

If a list generated by [ ] can accept some enumerable expressions, it would
have to accept them all. I'm really unsure of how to approach this. If
1...1000 is an enumerable expression, then wouldn't you expect
IEnumerable<int> x = 1...1000
[x]
to mean the same as
[1...1000]?

following that logic,
List<int> x = new List<int>();
x.Add(1);
x.Add(2);
x.Add(3);
....
x.Add(1000);
[x]

should mean the same shouldn't it?

what about
[GetRangeSequence(1,1000)]

public IEnumerable<int> GetRangeSequence(int start, int finish)
{
    for (int i = start; i <= finish; i++)
        yield return i;
}

should that result in the same list?

if you say yes to all of these, then why wouldn't
["water"]
and
['w','a','t','e','r']
be the same?
So, now the issue is figuring out how to express ranges legally and usefully
in the type system without compromising something else, or figuring out how
to handle strings and contained lists.

From what you've said, it sounds like you already consider ranges to be
enumerations. And as I mentioned before, I would use ranges only where they
make sense - as shorthand for defining portions of the list. What areas
would this make sense in? Ints, Chars, Bytes, etc. Basically any
type that has well-defined bounds and well-defined finite increments.
That's not to say you couldn't add other items to the list, but they would
need to be added on an individual basis. Ranges like ["hello".."cat"] and
[3.141592..2.8] make no sense to define as enumerable, incrementable ranges.
To add them to a list, however, [1..100, "hello", "cat", 3.141592, 2.8]
would make more sense, along with whatever in-between values the programmer
deems necessary. Again, you have to put some of the responsibility back on
the programmer. You can never write a tool that can be all things to all
people; but you can create tools that others can expand on later.

I think you miss the point. The issue isn't that strings won't work in
ranges; they certainly don't make sense in them. The point is that a string
is indistinguishable from any other range when viewed through
IEnumerable-colored glasses.
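
To make the "IEnumerable-colored glasses" point concrete, here is a minimal sketch in present-day C#. The `Expand` helper is hypothetical: it stands in for a [ ] rule that flattens every IEnumerable item, and `Enumerable.Range` stands in for the proposed 1...5 range syntax. Neither is the implementation under discussion in this thread.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

static class FlattenEverything
{
    // Hypothetical rule for the [ ] syntax: any IEnumerable item is expanded
    // into its elements; anything else is added as a single object.
    public static List<object> Expand(params object[] items)
    {
        var result = new List<object>();
        foreach (object item in items)
        {
            var sequence = item as IEnumerable;
            if (sequence != null)
            {
                foreach (object element in sequence)
                    result.Add(element);
            }
            else
            {
                result.Add(item);
            }
        }
        return result;
    }

    static void Main()
    {
        // A range stand-in expands into its five values, as intended.
        List<object> numbers = Expand(Enumerable.Range(1, 5));
        Console.WriteLine(numbers.Count);   // 5

        // string also implements IEnumerable, so the very same rule
        // splits "water" into its five characters.
        List<object> letters = Expand("water");
        Console.WriteLine(letters.Count);   // 5
        Console.WriteLine(letters[0]);      // w
    }
}
```

Under this one rule there is no way to keep "water" whole without naming string as an exception, which is the difficulty described above.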
 
Michael C

Daniel O'Connell said:
[for x, y in [1..1000, -2..2000]]

What order does that return? all x's then all y's? or x, then y then x then
y and so on? How do you yield pairs? How do you perform joins?

To my thinking, this would return an enumeration for x [1..1000] and a
nested loop of y [-2..2000]. The syntax is similar to Python but the
behavior I'd have to look up. Not sure if Python nests the y inside the x
or if it performs the y loop subsequently to the x loop.
Again, I'm not sure if this really makes much sense. Each of these examples
is basically a useless use of the keywords. The for/foreach syntax doesn't
add anything to the list without some kind of processing.

The differentiation was strictly for your benefit - to ease your
implementation, since one requires strong typing and will let you know
upfront that it might be very optimizable; the other to let you know that
strong typing is not an option, and that optimization will not be able to be
as good. As long as you're willing to write the code to determine this
without the hints, you're good to go. It also appears that we're not as
concerned with optimization of for loops and typing - since lists will be
strongly typed. From what you've said, it appears that

[yield value in [1..10]]

is functionally and programmatically equivalent to

[yield value in [1,2,3,4,5,6,7,8,9,10]]

So that it cannot be optimized down to

for (value = 1; value <= 10; value++)

But rather something more like

foreach (value in new int [] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10})

Is this correct? If so, then the for/foreach examples I presented are
unnecessary. Just to make sure we're on the same page, we're no longer
concerned with optimization, right?
Beyond that, since the lists are mostly strongly typed, I think you could
achieve optimizations using either for or foreach in the majority of cases,
even though it would include a sacrifice in clarity of the code to IL
mapping.

Well, again, if we're dealing with strongly typed lists, then why the
implicitly typed 'value' keyword? Just have the user declare a strongly
typed variable. It seems like we're bouncing back and forth on some of
these issues here. Do the lists have to be strongly typed, is implicit
typing an issue, what type of optimizations are you looking to implement?

The clarity of the code to IL mapping is probably not as critical to most
developers as it will be to you, since you're developing the actual tool.
As long as it doesn't cause you concern, it probably isn't too much of an
issue.

I guess one question I just have to ask is what exactly is a list? I've
worked with lists in various languages, and I'm viewing lists with those
preconceived notions here. Just for reference, how would *you* expand the
following statements out in C#?

List a = [1..100, 10..30, 500, 35];

List b = ["hello", "how are you?", "goodbye"];

[yield i in [1..100]]

[yield j in b]

And are you staying with strongly typed lists? If not, how would you
define, in C# terms, a list that is not strongly typed?

Thanks,
Michael C.
 
Michael C

Again, I see no reason why you can't merge this into one keyword. Two
different compiled representations of one keyword are irrelevant as long as
the *semantic* and effective behaviour is the same.

See other post. As long as you're willing to differentiate between weakly
typed lists and strongly typed lists; or prohibit weakly typed lists
altogether; and don't care about optimization hints, it isn't necessary and
one keyword is just fine.
Well... if we're dealing with single-type lists, implicit typing should not
be an issue at all, as the reference variable should be explicitly typed
anyway. As in

int i;
[yield i for [1..1000]]

IMHO, I'd like to find a way where the variable exists only *within* the
list, just as for and foreach loops work now.

Simple.

[yield int i for [1..1000]]

Since it's declared within the scope of the list iterator, it exists only
within that scope just like a for loop.
This is where the compiler comes in. There are quite a few tricks that can
be pulled off by the compiler without the user being able to tell the
difference.
Consider these loops: ...
In essence it's entirely possible to optimize and still maintain 1...100 as
an enumerator. That particular magic is unsurprising, consistent and
beneficial, unlike some other bits.
It also makes this optimization optional, as not supporting it results in
code that is semantically identical.

This makes perfect sense when dealing with ranges and lists composed only of
ranges. Just one question - how would it handle the following:

[1..10]
[1,2,3,4,5,6,7,8,9,10]
If a list generated by [ ] can accept some enumerable expressions, it would
have to accept them all. I'm really unsure of how to approach this. If
1...1000 is an enumerable expression, then wouldn't you expect
IEnumerable<int> x = 1...1000
[x]
to mean the same as
[1...1000]?

following that logic,
List<int> x = new List<int>();
x.Add(1);
x.Add(2);
x.Add(3); ....

should that result in the same list?

if you say yes to all of these, then why wouldn't
["water"]
and
['w','a','t','e','r']
be the same?

For one, "water" is not, by definition, a range. Since we're talking about
expected results here, do you really expect that "water" should present an
enumerable Range of values like 1..1000? Or that List.Add("water") should
produce the same result as

List.Add('w');
List.Add('a');
List.Add('t');
List.Add('e');
List.Add('r');

I think the difference we're talking about here is the difference between
enumerable objects and enumerable Ranges. By definition, I would think that
a Range is composed of objects, and that the enumeration of a List would not
automatically break down and enumerate string objects in the List. Although
if the user wanted to address the individual list objects, they could
obviously enumerate them... Themselves, and one at a time.
I think you miss the point. The issue isn't that strings won't work in
ranges; they certainly don't make sense in them. The point is that a string
is indistinguishable from any other range when viewed through
IEnumerable-colored glasses.

See, difference of opinion - I think you miss the point.

Again, I go back to ArrayLists. ArrayLists are IEnumerable. You can add
strings to them. The ArrayList, during enumeration does not break down the
string into its component chars. So why is the eyeglass prescription so
different for a List when compared to other IEnumerables?

Especially since, unlike ArrayList, we're looking at *strongly* typed
lists... It's not like we're going to enumerate a list and hit <int>,
<int>, <string>... You know in advance whether it's a list of strings or a
list of ints, right? With that knowledge, why would enumerating a
strongly-typed list of strings be a problem? Especially since we already
know in advance that we can't define a range of strings; so we know in
advance that if we have a list of strings we're not going to enumerate them
into their individual chars or bits or other weirdness...
 
Daniel O'Connell [C# MVP]

Michael C said:
Daniel O'Connell said:
[for x, y in [1..1000, -2..2000]]

What order does that return? all x's then all y's? or x, then y then x then
y and so on? How do you yield pairs? How do you perform joins?

To my thinking, this would return an enumeration for x [1..1000] and a
nested loop of y [-2..2000]. The syntax is similar to Python but the
behavior I'd have to look up. Not sure if Python nests the y inside the x
or if it performs the y loop subsequently to the x loop.
Again, I'm not sure if this really makes much sense. Each of these examples
is basically a useless use of the keywords. The for/foreach syntax doesn't
add anything to the list without some kind of processing.

The differentiation was strictly for your benefit - to ease your
implementation, since one requires strong typing and will let you know
upfront that it might be very optimizable; the other to let you know that
strong typing is not an option, and that optimization will not be able to be
as good. As long as you're willing to write the code to determine this
without the hints, you're good to go. It also appears that we're not as

I think determining the two *shouldn't* be too terribly difficult.
Implementation is usually the easy part; it's defining the proper behaviour
and making it work with every other rule that you have to abide by that will
make you tear your hair out and keep it in a raisin canister.

Thanks for the thought though.
concerned with optimization of for loops and typing - since lists will be
strongly typed. From what you've said, it appears that

[yield value in [1..10]]

is functionally and programmatically equivalent to

[yield value in [1,2,3,4,5,6,7,8,9,10]]

So that it cannot be optimized down to

for (value = 1; value <= 10; value++)

But rather something more like

foreach (value in new int [] {1, 2, 3, 4, 5, 6, 7, 8, 9, 10})

Is this correct? If so, then the for/foreach examples I presented are
unnecessary. Just to make sure we're on the same page, we're no longer
concerned with optimization, right?

They are both semantically identical; that is, they *will* have the same end
result. However, the actual implementation and function of the two may be
*very* different, and the compiler itself will certainly be able to know the
difference. Since ranges have a defined start and end bound in many cases,
it wouldn't be hard for the compiler to play tricks and efficiently generate
lists that take up very little space until they are actually used, by
reserving indexes that exist as an enumerator but not actually generating
the values until they are accessed. A list comprehension is a different
matter and will usually result in a completely generated list, since in most
any circumstance where you use a where expression the ability to determine
range length goes out the window. It wouldn't be impossible for the compiler
to evaluate mathematics-only where expressions for ints and other basic types
and determine the actual number of results; I'm just not sure I would want to
go that far unless it showed significant value.
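
One way to picture a range that reserves its indexes without generating the values is the sketch below, in present-day C#. `LazyRange` is a hypothetical type, not the proof-of-concept discussed here. It implements IReadOnlyList<int> rather than IList<int> to keep the example short, and it assumes the span fits in an int.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical lazily evaluated range: stores only its start and its length,
// and computes each value when it is asked for.
sealed class LazyRange : IReadOnlyList<int>
{
    private readonly int start;

    public int Count { get; }

    public LazyRange(int start, int end)
    {
        if (end < start)
            throw new ArgumentException("end must be >= start");
        this.start = start;
        Count = end - start + 1;   // assumption: the span fits in an int
    }

    // Random access costs one addition; no backing array is ever allocated.
    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return start + index;
        }
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < Count; i++)
            yield return start + i;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class LazyRangeDemo
{
    static void Main()
    {
        var range = new LazyRange(1, 800000);
        Console.WriteLine(range.Count);     // 800000
        Console.WriteLine(range[499999]);   // 500000
    }
}
```

A comprehension with a where clause cannot be given a `Count` this cheaply, which matches the distinction drawn above.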

It also wouldn't be beyond the compiler to take expressions like
[1,2,3,4,5,6,7,8,9,10] and transform them into 1...10 (or vice versa) if it
felt execution or memory efficiency would benefit. For exceedingly small
cases like this, I would probably lean towards generating
[1,2,3,4,5,6,7,8,9,10] instead of forcing generation of each value.
Similarly, if someone was insane enough to put 800000 direct sequential
entries of longs into a list, I would think collapsing it into a range would
be better.

A pragma is probably appropriate to explicitly enable or disable this type
of optimization over a range, at least for the proof of concept
implementation. It'd make testing easier and help out if there is a
particular reason you want one or the other that the compiler can't
understand (like a massive number of random index lookups).
Well, again, if we're dealing with strongly typed lists, then why the
implicitly typed 'value' keyword? Just have the user declare a strongly
typed variable. It seems like we're bouncing back and forth on some of
these issues here. Do the lists have to be strongly typed, is implicit
typing an issue, what type of optimizations are you looking to implement?

The clarity of the code to IL mapping is probably not as critical to most
developers as it will be to you, since you're developing the actual tool.
As long as it doesn't cause you concern, it probably isn't too much of an
issue.

I guess one question I just have to ask is what exactly is a list? I've
worked with lists in various languages, and I'm viewing lists with those
preconceived notions here. Just for reference, how would *you* expand the
following statements out in C#?
Each of these is given both as what I intend and as what the rules *would*
do in the compiler:
my intent first, actual effective results second
List a = [1..100, 10..30, 500, 35];
Literally what is typed (I'm not going to type everything out):
values 1 through 100, then 10 through 30, 500, and 35.
Same for the compiler.
List b = ["hello", "how are you?", "goodbye"];
As it is here: hello, how are you?, and goodbye.
However, based on the range rules the compiler would generate each character
separately. Hopefully I explained that well enough in my other post.
[yield i in [1..100]]
1 through 100.

[yield j in b]
If you follow my rules you get: hello, how are you? and goodbye.
If you follow the actual compiler rules you end up with each character,
because b is already a list of each character.

This is where my issue is: creating rules for the compiler that will
evaluate properly, not necessarily creating a working model in my mind.
And are you staying with strongly typed lists? If not, how would you
define, in C# terms, a list that is not strongly typed?
A list will be strongly typed where possible. However, I'm implicitly typing
value because I don't want to have to type int i in a comprehension:
[yield i foreach int i in 1...1000]
isn't terribly attractive, although it is effective and perhaps necessary.
Value is already implicitly typed in properties, indexers, etc.

The list [1...1000] *will* evaluate to IList<int>
 
Daniel O'Connell [C# MVP]

Michael C said:
Again, I see no reason why you can't merge this into one keyword. Two
different compiled representations of one keyword are irrelevant as long as
the *semantic* and effective behaviour is the same.

See other post. As long as you're willing to differentiate between weakly
typed lists and strongly typed lists; or prohibit weakly typed lists
altogether; and don't care about optimization hints, it isn't necessary and
one keyword is just fine.
Well... if we're dealing with single-type lists, implicit typing should not
be an issue at all, as the reference variable should be explicitly typed
anyway. As in

int i;
[yield i for [1..1000]]

IMHO, I'd like to find a way where the variable exists only *within* the
list, just as for and foreach loops work now.

Simple.

[yield int i for [1..1000]]

Since it's declared within the scope of the list iterator, it exists only
within that scope just like a for loop.

That still doesn't explain how to return i*2.
[yield int i*2 for[1...1000]] doesn't make a whole lot of sense.
This is where the compiler comes in. There are quite a few tricks that can
be pulled off by the compiler without the user being able to tell the
difference.
Consider these loops: ...
In essence it's entirely possible to optimize and still maintain 1...100 as
an enumerator. That particular magic is unsurprising, consistent and
beneficial, unlike some other bits.
It also makes this optimization optional, as not supporting it results in
code that is semantically identical.

This makes perfect sense when dealing with ranges and lists composed only of
ranges. Just one question - how would it handle the following:

[1..10]
[1,2,3,4,5,6,7,8,9,10]

Explained that in the other post, we should probably merge these into one
sometime soon before things get confusing, LOL.
If a list generated by [ ] can accept some enumerable expressions, it would
have to accept them all. I'm really unsure of how to approach this. If
1...1000 is an enumerable expression, then wouldn't you expect
IEnumerable<int> x = 1...1000
[x]
to mean the same as
[1...1000]?

following that logic,
List<int> x = new List<int>();
x.Add(1);
x.Add(2);
x.Add(3); ...

should that result in the same list?

if you say yes to all of these, then why wouldn't
["water"]
and
['w','a','t','e','r']
be the same?

For one, "water" is not, by definition, a range. Since we're talking about
expected results here, do you really expect that "water" should present an
enumerable Range of values like 1..1000? Or that List.Add("water") should
produce the same result as

List.Add('w');
List.Add('a');
List.Add('t');
List.Add('e');
List.Add('r');

I think the difference we're talking about here is the difference between
enumerable objects and enumerable Ranges. By definition, I would think that
a Range is composed of objects, and that the enumeration of a List would not
automatically break down and enumerate string objects in the List. Although
if the user wanted to address the individual list objects, they could
obviously enumerate them... Themselves, and one at a time.

However, you miss a point. List.Add *doesn't* support adding ranges in any
way, shape, or form. *EVERYTHING* is a single object. If List.Add did double
duty, adding both ranges and single items, you would probably have a similar
problem (the compiler choosing Add(IEnumerable) over Add(object) when you
pass it a string). It *NEVER* has to make a decision about what to do; its
behaviour is well defined and mapped out. As things stand with this list
syntax there is no clear definition that results in expected behaviour.
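
The overload problem mentioned in parentheses can be reproduced with a small sketch. `DoubleDutyList` is a hypothetical class written only for this illustration; the real List<T>.Add has no IEnumerable overload.

```csharp
using System;
using System.Collections;

// Hypothetical list whose Add does double duty, as described above.
class DoubleDutyList
{
    public string LastCall = "";

    public void Add(object item)       { LastCall = "Add(object)"; }
    public void Add(IEnumerable items) { LastCall = "Add(IEnumerable)"; }
}

static class OverloadDemo
{
    static void Main()
    {
        var list = new DoubleDutyList();

        // int is not IEnumerable, so only Add(object) applies.
        list.Add(42);
        Console.WriteLine(list.LastCall);   // Add(object)

        // string converts to both parameter types; IEnumerable is the more
        // specific one, so overload resolution prefers it.
        list.Add("water");
        Console.WriteLine(list.LastCall);   // Add(IEnumerable)

        // An explicit cast is needed to get the single-object behaviour.
        list.Add((object)"water");
        Console.WriteLine(list.LastCall);   // Add(object)
    }
}
```

The cast in the last call is the kind of special syntax for strings that the earlier posts were hoping to avoid.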

Also, to make a point, what I expect isn't the issue. I certainly don't want
strings to behave this way. The problem is I can't find a set of rules that
does what I do want *without* causing this behaviour in strings.

The issue really is, what is a range? Is a range simply a shortcut that
makes lists and some loops easier, or is it a general purpose construct that
expresses a range of values? If it's the former you can throw this all away
and end up with significantly less flexibility; if it's the latter you have
to find a set of rules that allows its flexibility without compromising
anything else.
See, difference of opinion - I think you miss the point.

Again, I go back to ArrayLists. ArrayLists are IEnumerable. You can add
strings to them. The ArrayList, during enumeration, does not break down the
string into its component chars. So why is the eyeglass prescription so
different for a List when compared to other IEnumerables?

Especially since, unlike ArrayList, we're looking at *strongly* typed
lists... It's not like we're going to enumerate a list and hit <int>,
<int>, <string>... You know in advance whether it's a list of strings or a
list of ints, right? With that knowledge, why would enumerating a
strongly-typed list of strings be a problem? Especially since we already
know in advance that we can't define a range of strings; so we know in
advance that if we have a list of strings we're not going to enumerate them
into their individual chars or bits or other weirdness...

You know ahead of time while enumerating, but the list's type is implicitly
determined from its contents. Just as the compiler infers that 1 is an
integer in most cases, it would infer that [1] is a list of integers. Thus,
the compiler decides what type the list is based on the contents of the
list, and thus it can't use the type of the list to determine how to use the
contents of said list.

This is a departure from standard C# behaviour, where everything is
explicitly typed, but I can't think of a really good way to do it any other
way. Ideally the object would support both IList<T> and IList<object> where
applicable, but that really is a different matter altogether.

Now, I'll give you that you could explicitly ignore string when processing
IEnumerables; however, that breaks consistency. Instead of saying "any
IEnumerable expression", it becomes "any IEnumerable expression except those
typed string", which is tricky, IMHO.
 
Michael C

Daniel O'Connell said:
Simple.

[yield int i for [1..1000]]

Since it's declared within the scope of the list iterator, it exists only
within that scope just like a for loop.

That still doesn't explain how to return i*2.
[yield int i*2 for[1...1000]] doesn't make a whole lot of sense.

It makes as much sense as [yield value*2 for [1..1000]]. I guess the value
of "value" is only valid within the [ ]'s? So how will you, for instance,
print 'value' to the console? Or use 'value' in a conditional statement or
anything else?

ranges. Just one question - how would it handle the following:

[1..10]
[1,2,3,4,5,6,7,8,9,10]

Explained that in the other post, we should probably merge these into one
sometime soon before things get confusing, LOL.

Agreed. LOL.
Also, to make a point, what I expect isn't the issue. I certainly don't want
strings to behave this way. The problem is I can't find a set of rules that
does what I do want *without* causing this behaviour in strings.

On a purely physical level, we can say the difference between ranges and
other types is the .. operator for starters. Obviously if the .. operator
is not there, it's not a range. By adding ranges, the compiler *will* be
making a choice about adding ranges or adding single objects, correct?
So... if we are not dealing with a range, we obviously add a single
object... whether string or int or whatever...
The issue really is, what is a range? Is a range simply a shortcut that
makes lists and some loops easier, or is it a general purpose construct that
expresses a range of values? If it's the former you can throw this all away
and end up with significantly less flexibility; if it's the latter you have
to find a set of rules that allows its flexibility without compromising
anything else.

I agree with the flexibility, but I also think there should be some level of
*consistency* in the user experience when designing a solution. For
instance, based on my experience with other IEnumerables, I would not expect
the following:

List.Add("water");
[yield value in List]

To return

w
a
t
e
r

This is based on my experience with other IEnumerables where you can expect
the following:

myArrayList.Add("water");
foreach (s in myArrayList)
Console.WriteLine (s);

Returns

water

Whatever rules you decide to implement, in whatever fashion, I would just
try to make sure that they provide consistency with other comparable types.
You know ahead of time while enumerating, but the list's type is implicitly
determined from its contents. Just as the compiler infers that 1 is an
integer in most cases, it would infer that [1] is a list of integers. Thus,
the compiler decides what type the list is based on the contents of the
list, and thus it can't use the type of the list to determine how to use the
contents of said list.

OK, you lost me here. The compiler *can* determine the type of the list,
yet it *can't* use that information? Think about that... It just makes no
sense at all.
Now, I'll give you that you could explicitly ignore string when processing
IEnumerables; however, that breaks consistency. Instead of saying "any
IEnumerable expression", it becomes "any IEnumerable expression except those
typed string", which is tricky, IMHO.

Now we get to the core... It's tricky. We know how we think it should
operate; and I think you might agree that a string within a list should be
returned as a complete string when the list is enumerated. So the only
major sticking point is the trickiness.

As for consistency, IMHO, the main consistency issue that needs to be
addressed is consistency in the overall user experience. If the average
user would expect that adding a string to a list would result in that entire
string being returned - intact - during enumeration of the list, then that's
what should be implemented; regardless of any esoteric rules regarding
differentiation of IEnumerables. Perhaps taking another step back, the
major issue here seems to be Ranges and their conflict with strings. Should
we do as you said above and re-think the definition of a range? It seems a
range should be an IEnumerable type of its own, in which case you need to
check for IEnumerable Ranges versus other types and treat them differently,
as opposed to treating strings differently.
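
The idea of a range as an IEnumerable type of its own could look roughly like the sketch below. `IntRange` and `Expand` are hypothetical names invented for illustration, and the [ ] literal is imitated with a params method; this is an assumption about how such a rule might be written, not a design anyone in the thread has committed to.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical dedicated range type standing in for 1..3.
sealed class IntRange : IEnumerable<int>
{
    private readonly int start, end;

    public IntRange(int start, int end)
    {
        this.start = start;
        this.end = end;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // Assumes end < int.MaxValue so that i++ cannot wrap around.
        for (int i = start; i <= end; i++)
            yield return i;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class RangeAwareList
{
    // Only IntRange items are expanded; every other item, including
    // strings and other IEnumerables, is added as a single object.
    public static List<object> Expand(params object[] items)
    {
        var result = new List<object>();
        foreach (object item in items)
        {
            var range = item as IntRange;
            if (range != null)
            {
                foreach (int value in range)
                    result.Add(value);
            }
            else
            {
                result.Add(item);
            }
        }
        return result;
    }

    static void Main()
    {
        List<object> list = Expand(new IntRange(1, 3), "water", 500);
        Console.WriteLine(list.Count);   // 5: 1, 2, 3, "water", 500
        Console.WriteLine(list[3]);      // water
    }
}
```

The trade-off is that arrays, other lists, and iterator results are no longer expanded either, so the rule is narrower than "any IEnumerable expression".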
 
Daniel O'Connell [C# MVP]

Michael C said:
Daniel O'Connell said:
Simple.

[yield int i for [1..1000]]

Since it's declared within the scope of the list iterator, it exists only
within that scope just like a for loop.

That still doesn't explain how to return i*2.
[yield int i*2 for[1...1000]] doesn't make a whole lot of sense.

It makes as much sense as [yield value*2 for [1..1000]]. I guess the value
of "value" is only valid within the [ ]'s? So how will you, for instance,
print 'value' to the console? Or use 'value' in a conditional statement or
anything else?

For more complicated situations like that, a foreach loop makes *a lot* more
sense. List comprehensions are basically for simple list transformations and
to express more complicated ranges succinctly.
ranges. Just one question - how would it handle the following:

[1..10]
[1,2,3,4,5,6,7,8,9,10]

Explained that in the other post, we should probably merge these into one
sometime soon before things get confusing, LOL.

Agreed. LOL.
Also, to make a point, what I expect isn't the issue. I certainly don't want
strings to behave this way. The problem is I can't find a set of rules that
does what I do want *without* causing this behaviour in strings.

On a purely physical level, we can say the difference between ranges and
other types is the .. operator for starters. Obviously if the .. operator
is not there, it's not a range. By adding ranges, the compiler *will* be
making a choice about adding ranges or adding single objects, correct?
So... if we are not dealing with a range, we obviously add a single
object... whether string or int or whatever...
The issue really is, what is a range? Is a range simply a shortcut that
makes lists and some loops easier, or is it a general purpose construct that
expresses a range of values? If it's the former you can throw this all away
and end up with significantly less flexibility; if it's the latter you have
to find a set of rules that allows its flexibility without compromising
anything else.

I agree with the flexibility, but I also think there should be some level of
*consistency* in the user experience when designing a solution. For
instance, based on my experience with other IEnumerables, I would not expect
the following:

List.Add("water");
[yield value in List]

To return

w
a
t
e
r

Nor would I, nor would it really.
This is the tricky part, while
List.Add("water");
[yield value in List]

will return a list with one value, that being "water",
List = ["water"]
[yield value in List]
would result in
w
a
t
e
r

It has nothing to do with enumeration as such; it is entirely based in
interpreting list items. How do you interpret ["abc"]? Is it a string or is
it an IEnumerable object?
You know ahead of time while enumerating, but the list's type is implicitly
determined from its contents. Just as the compiler infers that 1 is an
integer in most cases, it would infer that [1] is a list of integers. Thus,
the compiler decides what type the list is based on the contents of the
list, and thus it can't use the type of the list to determine how to use the
contents of said list.

OK, you lost me here. The compiler *can* determine the type of the list,
yet it *can't* use that information? Think about that... It just makes no
sense at all.

Well, I guess I should rephrase that: Since the compiler uses the type of
its entries to infer the type of the list, the type of the list cannot be
used to infer the type of the entries.
In other words, you can't decide whether ["abc"] is a list of three
characters or a list of one string based on the list being typed string or
character, because it's recursive: the list type is based on the
interpretation of the type you need the list type to interpret.
 
Michael C

It makes as much sense as [yield value*2 for [1..1000]]. I guess the

For more complicated situations like that, a foreach loop makes *a lot* more
sense. List comprehensions are basically for simple list transformations and
to express more complicated ranges succinctly.

But even in a situation like this, where I simply want to write the value*2
to the screen, where do I put the Console.WriteLine (value) at in this
statement:

[yield value*2 for [1..1000]]
It has nothing to do with enumeration as such; it is entirely based in
interpreting list items. How do you interpret ["abc"]? Is it a string or is
it an IEnumerable object?

Well, for our purposes it's obviously a string. And a string, while
IEnumerable, must be explicitly enumerated, no? We're obviously not
explicitly enumerating it; it seems to have more to do with setting up
special handling for Ranges versus other objects, since Ranges are really
the only IEnumerables we want to implicitly enumerate in the list.
Well, I guess I should rephrase that: Since the compiler uses the type of
its entries to infer the type of the list, the type of the list cannot be
used to infer the type of the entries.
In other words, you can't decide whether ["abc"] is a list of three
characters or a list of one string based on the list being typed string or
character, because it's recursive: the list type is based on the
interpretation of the type you need the list type to interpret.

I haven't actually built a compiler since I made a small C compiler back in
college, but this doesn't seem to follow. The compiler knows whether it is
a list of strings or list of chars, if for no other reason, the " and '.
Based on that, you could, if you were so inclined, create a special case for
strings. I understand you don't want to do that, but you have to actually
create a special case at some point. You automatically, by definition, have
two special cases:

1. List Items that should be implicitly enumerated when the list is
enumerated, and
2. List Items that should not be implicitly enumerated when the list is
enumerated

List Items that would fit into case 1 would include Ranges (and if you were
so inclined to add them later, sub-lists, arrays, etc.) List Items that
would fit into case 2 would include ints, chars, strings, etc. The
difference is you have to look at Ranges as objects unto themselves which
can be components in the list.
 
Daniel O'Connell [C# MVP]

Michael C said:
It makes as much sense as [yield value*2 for [1..1000]]. I guess the

For more complicated situations like that, a foreach loop makes *a lot* more
sense. List comprehensions are basically for simple list transformations and
to express more complicated ranges succinctly.

But even in a situation like this, where I simply want to write the value*2
to the screen, where do I put the Console.WriteLine (value) at in this
statement:

[yield value*2 for [1..1000]]

Hrmm, I think that it shouldn't be possible. List comprehensions are
supposed to provide list transformations, that is converting one list into
another based on simple rules. I don't know if it is appropriate for general
iteration syntax.
I'd rather see
    foreach (int i in [yield value*2 in 1...1000])
        Console.WriteLine(i);

It has nothing to do with enumeration as such; it is entirely based on
interpreting list items. How do you interpret ["abc"]? Is it a string or is
it an IEnumerable object?

Well, for our purposes it's obviously a string. And a string, while
IEnumerable, must be explicitly enumerated, no? We're obviously not
explicitly enumerating it; it seems to have more to do with setting up
special handling for Ranges versus other objects, since Ranges are really
the only IEnumerables we want to implicitly enumerate in the list.

Well, there are other lists and arrays and possibly the odd iterator or two.
Well, I guess I should rephrase that: Since the compiler uses the type of
its entries to infer the type of the list, the type of the list cannot be
used to infer the type of the entries.
In other words, you can't decide whether ["abc"] is a list of three
characters or a list of one string based on the list being typed string or
character, because it's recursive: the list type is based on the
interpretation of the type you need the list type to interpret.

I haven't actually built a compiler since I made a small C compiler back in
college, but this doesn't seem to follow. The compiler knows whether it is
a list of strings or a list of chars, if for no other reason than the " and ' delimiters.

Well, I think maybe I misunderstood your original intent. I thought you had
implied that the compiler should decide on the interpretation of string
based on the list type, which isn't possible because the list's type depends
on how you interpret the string. In light of all your other comments I
suspect you were implying something different.
Based on that, you could, if you were so inclined, create a special case for
strings. I understand you don't want to do that, but you have to actually
create a special case at some point. You automatically, by definition, have
two special cases:

1. List Items that should be implicitly enumerated when the list is
enumerated, and
2. List Items that should not be implicitly enumerated when the list is
enumerated
(minor disagreement here, I'm concerned with list generation, not
enumeration. By the time enumeration rolls around this should all have been
sorted out).
List Items that would fit into case 1 would include Ranges (and if you were
so inclined to add them later, sub-lists, arrays, etc.). List Items that
would fit into case 2 would include ints, chars, strings, etc. The
difference is you have to look at Ranges as objects unto themselves which
can be components in the list.
Right. These are the goals I'm working towards:
1) allow ranges in lists
2) allow ranges to be placed in variables.
3) allow variables holding ranges to be loaded into lists
4) allow sub lists, arrays, etc to be added to lists.
5?) allow lists to be added to lists?

ICollection could be used to express all of these without matching strings,
which makes it look ideal. But that still takes away the ability to create
lists of arrays or other lists. At the least one would want to be able to
define
    list = [1,2,3];
    list = [list,4];
right?
Maybe ICollection *is* better. Lists of lists could be constrained by using
a syntax that explicitly marks an ICollection as a single object, not as a
list to enumerate?
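
One way to picture that idea, as a rough C# sketch under assumptions: AsItem is a hypothetical stand-in for "a syntax that explicitly marks an ICollection as a single object", and ListLiteral.Build stands in for whatever the compiler would emit for a list literal. Neither name comes from the thread.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical marker: wraps a collection that should be stored as one item.
public sealed class AsItem
{
    public object Value { get; }
    public AsItem(object value) { Value = value; }
}

public static class ListLiteral
{
    // Flattens any ICollection entry by one level. System.String does not
    // implement ICollection, so a string is never split into characters.
    public static List<object> Build(params object[] entries)
    {
        var result = new List<object>();
        foreach (object entry in entries)
        {
            if (entry is AsItem wrapped)
            {
                result.Add(wrapped.Value);      // explicitly kept whole
            }
            else if (entry is ICollection collection)
            {
                foreach (object element in collection)
                    result.Add(element);        // implicitly enumerated
            }
            else
            {
                result.Add(entry);              // plain item
            }
        }
        return result;
    }

    public static void Main()
    {
        List<object> list = ListLiteral.Build(1, 2, 3);
        List<object> flattened = ListLiteral.Build(list, 4);          // 1, 2, 3, 4
        List<object> nested = ListLiteral.Build(new AsItem(list), 4); // [list], 4

        Console.WriteLine(flattened.Count); // 4
        Console.WriteLine(nested.Count);    // 2
    }
}
```

With this rule, list = [list,4] flattens by default, and a list of lists needs the explicit marker.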
 
M

Michael C

Daniel O'Connell said:
Hrmm, I think that it shouldn't be possible. List comprehensions are
supposed to provide list transformations, that is converting one list into
another based on simple rules. I don't know if it is appropriate for general
iteration syntax.
I'd rather see
    foreach (int i in [yield value*2 in 1...1000])
        Console.WriteLine(i);

That's perfect. I was just wondering how we would be able to use the values
yielded by the list comprehensions in code. In this case, I rather prefer
your previous recommendation of using [yield value...from...], as "in" is
already a reserved keyword and the two in's almost back to back do look a
little funny. But that's just a little thought on the cosmetics and not all
that important really.
Well, there are other lists and arrays and possibly the odd iterator or two.

True, I was referring to strings as the primary sticking point we're
discussing - the strings being IEnumerable objects that we don't want to
implicitly enumerate.
Well, I think maybe I misunderstood your original intent. I thought you had
implied that the compiler should decide on the interpretation of string
based on the list type, which isn't possible because the list's type depends
on how you interpret the string. In light of all your other comments I
suspect you were implying something different.

True, I was thinking the compiler would know the type of list based on the
contents of the list; and I was thinking it should determine a path of
action based on that information. Most cases I would think could be broken
down into two paths depending on the type of list enumerated: Lists
composed of objects we want to implicitly enumerate should be enumerated
(arrays, lists, ranges, etc.); Lists composed of objects we don't want to
implicitly enumerate should not be implicitly enumerated when the list is
enumerated (strings, ints, etc). I know it makes it more complex to have
two separate paths of action, but I don't see any other particular way to
handle it...
(minor disagreement here, I'm concerned with list generation, not
enumeration. By the time enumeration rolls around this should all have been
sorted out).

It does depend on how we're storing the list. This will sort itself out
only if we expand the list into its composite objects at list generation
time, so that:

aList = [1..4] -> converted to aList = [1,2,3,4] at list generation time and
stored as 4 separate composite objects at generation

Alternatively, I thought we were looking to generate the list above as:

aList = [1..4] -> converted to aList = [Range(1..4)] so that the single
composite object Range would be stored

If this is the case then we have one action path at list generation and 2
separate action paths at list enumeration time, based on each type of
object.
Right. These are the goals I'm working towards:
1) allow ranges in lists
2) allow ranges to be placed in variables.
3) allow variables holding ranges to be loaded into lists

If we define Ranges as a type, we should be able to do this without much
trouble.
4) allow sub lists, arrays, etc to be added to lists.
5?) allow lists to be added to lists?

A question about adding arrays, lists, ranges - In the following:

Range = 1..5;
List = [Range];
Range = 6..10;

The contents of the List would be populated via the Range's ICloneable
interface? So that List would not change, or does the List actually contain
a reference to Range (in which case the contents of List would be [6..10]
after Range is changed)?
Maybe ICollection *is* better. Lists of lists could be constrained by using
a syntax that explicitly marks an ICollection as a single object, not as a
list to enumerate?

The ICollection does look like it might be a way to go. I think that's more
of the direction I'm leaning - a single List object composed of other single
objects (Range objects, string objects, list objects, chars, ints, etc.)
 
D

Daniel O'Connell [C# MVP]

Michael C said:
Daniel O'Connell said:
Hrmm, I think that it shouldn't be possible. List comprehensions are
supposed to provide list transformations, that is converting one list into
another based on simple rules. I don't know if it is appropriate for general
iteration syntax.
I'd rather see
    foreach (int i in [yield value*2 in 1...1000])
        Console.WriteLine(i);

That's perfect. I was just wondering how we would be able to use the values
yielded by the list comprehensions in code. In this case, I rather prefer
your previous recommendation of using [yield value...from...], as "in" is
already a reserved keyword and the two in's almost back to back do look a
little funny. But that's just a little thought on the cosmetics and not all
that important really.

Ya, I was tired when I responded. That should have been [yield value*2 for
1...1000]. Comprehensions always generate a new list, with all the benefits
and drawbacks of that. You can iterate over it or whatever else you wish to
do with that particularly typed list.
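
For comparison, a rough hand-written C# equivalent of what [yield value*2 for 1...1000] would stand for. This is only a sketch: DoubledRange is a hypothetical helper, and it assumes the comprehension materializes a new List<int> rather than a lazy sequence, as described above.

```csharp
using System;
using System.Collections.Generic;

public static class ComprehensionSketch
{
    // Builds the new list the comprehension would produce:
    // every value in first..last (inclusive), doubled.
    public static List<int> DoubledRange(int first, int last)
    {
        var result = new List<int>();
        for (int value = first; value <= last; value++)
            result.Add(value * 2);
        return result;
    }

    public static void Main()
    {
        // The comprehension only produces the list; printing stays in
        // an ordinary foreach over the result.
        foreach (int i in DoubledRange(1, 1000))
            Console.WriteLine(i);
    }
}
```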

Note that back to back in's is a big reason why I was pushing for isin in
another portion of the thread.
True, I was referring to strings as the primary sticking point we're
discussing - the strings being IEnumerable objects that we don't want to
implicitly enumerate.

Ya, I just don't like arbitrarily choosing which the compiler will handle.
It ends up being too much like compiler magic and will probably confuse
users.
Well, I think maybe I misunderstood your original intent. I thought you had
implied that the compiler should decide on the interpretation of string
based on the list type, which isn't possible because the list's type depends
on how you interpret the string. In light of all your other comments I
suspect you were implying something different.

True, I was thinking the compiler would know the type of list based on the
contents of the list; and I was thinking it should determine a path of
action based on that information. Most cases I would think could be broken
down into two paths depending on the type of list enumerated: Lists
composed of objects we want to implicitly enumerate should be enumerated
(arrays, lists, ranges, etc.); Lists composed of objects we don't want to
implicitly enumerate should not be implicitly enumerated when the list is
enumerated (strings, ints, etc). I know it makes it more complex to have
two separate paths of action, but I don't see any other particular way to
handle it...
(minor disagreement here, I'm concerned with list generation, not
enumeration. By the time enumeration rolls around this should all have been
sorted out).

It does depend on how we're storing the list. This will sort itself out
only if we expand the list into its composite objects at list generation
time, so that:

aList = [1..4] -> converted to aList = [1,2,3,4] at list generation time and
stored as 4 separate composite objects at generation

Alternatively, I thought we were looking to generate the list above as:

aList = [1..4] -> converted to aList = [Range(1..4)] so that the single
composite object Range would be stored

If this is the case then we have one action path at list generation and 2
separate action paths at list enumeration time, based on each type of
object.
The actual case will be something more like
using System;

public class InternalList<T>
{
    public T this[int index]
    {
        get
        {
            // Look up the index: if it falls within a range, expand the range;
            // otherwise look the index up directly.
            throw new NotImplementedException(); // placeholder so the sketch compiles
        }
    }
}

It'll result in a more complicated list class (to support range splitting,
etc. for inserts), but I can handle writing that in a support class. This
particular feature will probably require an extra library simply to allow
inter-class compatibility.
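
A minimal sketch of that kind of lookup, under simplifying assumptions: it handles only int items, it is read-only (no inserts, so no range splitting), and RangeBackedList and its members are hypothetical names rather than the planned support class.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: an int list whose storage keeps ranges unexpanded.
public class RangeBackedList
{
    // Each segment is an inclusive run; a single value is a run of length 1.
    private readonly List<(int Start, int End)> segments =
        new List<(int Start, int End)>();

    public void Add(int value) => segments.Add((value, value));

    public void AddRange(int start, int end)
    {
        if (end < start) throw new ArgumentException("end must be >= start");
        segments.Add((start, end));
    }

    public int this[int index]
    {
        get
        {
            if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));

            // Walk the segments, skipping whole runs until the index
            // falls inside one, then compute the value from its offset.
            int remaining = index;
            foreach (var (start, end) in segments)
            {
                int length = end - start + 1;
                if (remaining < length) return start + remaining;
                remaining -= length;
            }
            throw new ArgumentOutOfRangeException(nameof(index));
        }
    }

    public static void Main()
    {
        var list = new RangeBackedList();
        list.AddRange(1, 4);          // stored as one segment, not four items
        list.Add(10);

        Console.WriteLine(list[3]);   // 4
        Console.WriteLine(list[4]);   // 10
    }
}
```

The linear walk keeps the sketch short; a real implementation would likely cache cumulative lengths so lookups do not scan every segment.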
If we define Ranges as a type, we should be able to do this without much
trouble.

Ya, except I wanted to define Ranges as IEnumerables, which doesn't work. ;)
4) allow sub lists, arrays, etc to be added to lists.
5?) allow lists to be added to lists?

A question about adding arrays, lists, ranges - In the following:

Range = 1..5;
List = [Range];
Range = 6..10;

The contents of the List would be populated via the Range's ICloneable
interface? So that List would not change, or does the List actually contain
a reference to Range (in which case the contents of List would be [6..10]
after Range is changed)?

In that case it's irrelevant.
Range = 6...10; creates a *new* object. You would just be replacing a
variable's value, not the variable itself. IMHO, ranges should be immutable
and joins should result in an entirely new range being generated.
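
In plain C# terms the behaviour would look roughly like this. IntRange here is a hypothetical immutable class standing in for the proposed range type, and an ordinary List<IntRange> stands in for the proposed list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical immutable range: no setters, so a stored range cannot change.
public sealed class IntRange
{
    public int Start { get; }
    public int End { get; }

    public IntRange(int start, int end) { Start = start; End = end; }

    public override string ToString() => Start + ".." + End;
}

public static class Demo
{
    public static void Main()
    {
        IntRange range = new IntRange(1, 5);
        var list = new List<IntRange> { range }; // list refers to the 1..5 object
        range = new IntRange(6, 10);             // rebinds the variable only

        Console.WriteLine(list[0]); // 1..5
        Console.WriteLine(range);   // 6..10
    }
}
```

Because the 1..5 object is never modified, it makes no observable difference whether the list cloned it or kept a reference to it.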
 
M

Michael C

Note that back to back in's is a big reason why I was pushing for isin in
another portion of the thread.

Yeah, when I saw how the comprehensions interact with other code, the 'in'
problem became clear.
Range = 1..5;
List = [Range];
Range = 6..10;

In that case it's irrelevant.
Range = 6...10; creates a *new* object. You would just be replacing a
variable's value, not the variable itself. IMHO, ranges should be immutable
and joins should result in an entirely new range being generated.

That may be the case, but if we're assigning the range 1..5 to a variable,
and later assign the range 6..10 to the same variable... I suppose my first
question is what happens to the original 1..5 range? I would expect that
(if it were an object) it would just be disposed of and the 6..10 range
replace it. The main question though was what does the variable List
contain after the instructions above? Is it [1..5], [6..10] or some other
value/values?
 
