To LinQ or not LinQ

Gareth Erskine-Jones · Dec 6, 2008

A sequence always has an order, namely the order in which you traverse
the sequence. Applying an ordering operation to a set doesn't make it a
sequence: I can pick items from the set at random even if
it's ordered in a different order. That's a key difference, albeit
theoretical.

I'm not sure that's a "key" difference (I'm not sure how a difference
could be a "key" difference if it's only of theoretical interest) - a
relation (or set as you say) in relational theory has no order (in the
theory) but is stored in some order, and random selection from it
involves traversing the rows to find the required ones (indexes are
relevant only from a performance point of view). A sequence, however
it is ordered, is almost identical.

Forgive my boldness in challenging you in this way - I'm well aware
that my knowledge is relatively weak in this area - I only enter into
these discussions in an attempt to learn & understand more.

C#, ASP.NET development and contracting services, London, UK
http://www.sgat-computing-services.co.uk/

Gareth Erskine-Jones · Dec 6, 2008

That said, I'm a bit puzzled by the direction this debate's taken. The
assertion that SQL is _strictly_ set-based doesn't make sense to me
either, since when you query the database, records always have to be
returned in _some_ order. Even if the database doesn't impose a specific
order on the data, it still has some inherent order, just as a
theoretically orderless collection in .NET still has order.

I fully agree - and reading books by people like Date (and of course
Codd) reveals their extreme love of the pure relational database, and
their equally extreme distaste for SQL which they think violates the
model in many ways.

Likewise, while it's true that the LINQ syntax is very much centered
around the IEnumerable<T> interface, there's nothing about the interface
that requires the sequences to be well-defined. A collection _could_ in
fact return elements in random order for consecutive iterations. It's
only practicality that causes that not to happen in practice.

Try writing a poker application which has an IEnumerable<Card>
interface for the Deck class, and does indeed return elements in
(pseudo-) random order ;-)

This whole "one deals in sets,
the other deals with sequences" thing seems like a complete non-starter.

Again I agree - although the argument might have more validity if
instead of SQL one of the various relational languages that seem to
exist only in the books of database theorists were used instead of
SQL.


Gareth Erskine-Jones · Dec 6, 2008

Perhaps you should read more into set-oriented algebra vs. sequence
operations then. Or for example try to project linq queries onto sql and
you'll then see that for many queries a conversion has to take place
which isn't obvious.

Any chance of a couple of simple examples that would illustrate this
point - or links to relevant articles. I'd be genuinely interested.

thanks,


Frans Bouma [C# MVP] · Dec 6, 2008

Gareth said:
Any chance of a couple of simple examples that would illustrate this
point - or links to relevant articles. I'd be genuinely interested.

Here's one.
from c in ctx.Customer
group c by c.Country into g
select g;

This one gives a hierarchical resultset because it can build one when
consuming the stream. However, in SQL you can't do this.
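
A minimal in-memory sketch of what "hierarchical" means here. It runs the same query shape with LINQ to Objects over a made-up array instead of ctx.Customer (the names and data are assumptions, and the top-level statements need C# 9 or later):

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for ctx.Customer.
var customers = new[]
{
    new { Name = "Alfreds", Country = "Germany" },
    new { Name = "Bon app", Country = "France" },
    new { Name = "Koene", Country = "Germany" },
};

// Each element of the result is a key plus a nested sequence of customers,
// not a flat row.
var groups = (from c in customers
              group c by c.Country into g
              select g).ToList();

foreach (var g in groups)
{
    Console.WriteLine($"{g.Key}: {string.Join(", ", g.Select(c => c.Name))}");
}
```

A flat SQL resultset has no direct equivalent of the nested sequence inside each group, so a provider has to build that shape some other way.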

Here's another
from c in ctx.Customer
let totalOrders = c.Orders.Count
where c.Country=="Germany"
select new {c, totalOrders};

Here, a scalar is produced high up in the query for a given customer.
In SQL, however, this can't be done: you have to move the scalar into the
projection. If you think this is 'easy', it's not, as 'totalOrders' has
to be replaced with the scalar query. Needless to say, multiple
lets will make things even more different.

There are many others. For example the method 'Reverse' or
'SequenceEqual' aren't usable on a DB. 'Reverse' might sound like it is
usable, but this query proves it's not:
(from o in ctx.Order
orderby o.Customer.Country ascending
select o).Reverse();

In the DB, the query has to know the ordering it was given in order to
reverse the set, and even then it's not always possible. In memory, with
a sequence, it's clear. SequenceEqual is also a method which can't be done
in SQL, whereas in LINQ in memory with two sequences it works without a problem.
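
For comparison, a small sketch of both methods in memory. It uses LINQ to Objects over a made-up order array (the data and the Id/Country names are assumptions, not ctx.Order):

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for ctx.Order.
var orders = new[]
{
    new { Id = 1, Country = "Austria" },
    new { Id = 2, Country = "Germany" },
    new { Id = 3, Country = "Germany" },
};

// The sequence has a definite order after orderby, so Reverse is well defined.
var reversedIds = (from o in orders
                   orderby o.Country ascending
                   select o.Id).Reverse().ToArray();

// SequenceEqual compares element by element, position by position.
bool sameOrder = new[] { 1, 2, 3 }.SequenceEqual(new[] { 1, 2, 3 });
bool otherOrder = new[] { 1, 2, 3 }.SequenceEqual(new[] { 3, 2, 1 });

Console.WriteLine(string.Join(", ", reversedIds));
Console.WriteLine($"{sameOrder} {otherOrder}");
```

Both calls depend on element position, which is exactly what an unordered set does not have.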

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Frans Bouma [C# MVP] · Dec 6, 2008

Gareth said:
I'm not sure that's a "key" difference (I'm not sure how a difference
could be a "key" difference if it's only of theoretical interest) - a
relation (or set as you say) in relational theory has no order (in the
theory) but is stored in some order, and random selection from it
involves traversing the rows to find the required ones (indexes are
relevant only from a performance point of view). A sequence, however
it is ordered, is almost identical.

Forgive my boldness in challenging you in this way - I'm well aware
that my knowledge is relatively weak in this area - I only enter in to
these discussions in an attempt to learn & understand more.

It's a theoretical difference. The people who countered my point all
come up with a side effect of how some databases are implemented. That's
not the point. The point is what SQL is all about and what it doesn't
do. For example SELECT * FROM Table has by definition no ordering, it's
a set. To CONSUME the set, one could traverse it sequentially with a
cursor. However that's a technical implementation detail of a client API
utilizing the SQL engine. For example, if I do:
UPDATE Table
SET F1=F1+2
WHERE ID IN
(
    SELECT ID FROM Table2
    WHERE SomeField=@value
)

In which order are the rows in table updated? Undefined, because the
order of the select set is undefined and update works on a set, not a
sequence.

This is a fundamental element of understanding what SQL is and how to
use it. Too many times I see the question why:
SELECT * FROM Customers
ORDER BY Country ASC

on Northwind gives them nondeterministic results even though they
specified an ordering, so a cursor consuming the set could expect a
given ordering. It's because more than 1 row could have the same value
for 'Country' and therefore the order in which these rows are returned
is undefined. Even with a cursor. Sure, a cursor consumes the set in
'an' order, but not a deterministic order as with a sequence.

Mind you: a cursor is the sequence mechanism used to consume the set,
but that doesn't make the set have an ordering.
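
A rough in-memory illustration of the tie problem. The two arrays are made up and stand for the same rows arriving in two different physical orders. Enumerable.OrderBy is a stable sort, so here the ties simply follow the input order, whereas a database makes no such promise:

```csharp
using System;
using System.Linq;

// The same three hypothetical rows, delivered in two different orders.
var a = new[] { (Id: 1, Country: "Germany"), (Id: 2, Country: "Germany"), (Id: 3, Country: "Austria") };
var b = new[] { (Id: 2, Country: "Germany"), (Id: 3, Country: "Austria"), (Id: 1, Country: "Germany") };

// Ordering on Country alone leaves the two German rows tied.
var byCountryA = a.OrderBy(r => r.Country).Select(r => r.Id).ToArray();
var byCountryB = b.OrderBy(r => r.Country).Select(r => r.Id).ToArray();

// Adding a unique tie-breaker makes the result independent of input order.
var tieBrokenA = a.OrderBy(r => r.Country).ThenBy(r => r.Id).Select(r => r.Id).ToArray();
var tieBrokenB = b.OrderBy(r => r.Country).ThenBy(r => r.Id).Select(r => r.Id).ToArray();

Console.WriteLine(byCountryA.SequenceEqual(byCountryB));   // False
Console.WriteLine(tieBrokenA.SequenceEqual(tieBrokenB));   // True
```

The SQL counterpart of the tie-breaker is to extend the ORDER BY with a unique column, for example the primary key.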

FB


Frans Bouma [C# MVP] · Dec 6, 2008

Gareth said:
surely that *is* syntax.
?

while that is symantics.

If you call that semantics, you clearly have no clue.

SQL being about sets without ordering is a fundamental part of the SQL
language and also a core issue for many problems people have with
consuming sets through cursors (e.g. datareader).

Too many details here for me . For one thing - the relational
database model is set oriented (although "relation" seems to be the
preferred term). SQL gets a pretty bad slating from most of the
relational database gurus though - on the grounds that it's based on
tables, not relations (the former can have duplicate rows, the latter
can't), and that "ORDER BY" does tend to sink the "sets don't have an
ordering" argument.

Date hates everything he didn't cook up himself, so take his opinion
with a grain of salt. That said, I doubt it that the 'guru's' will say
that SQL is bad with respect to relational models: SQL is a language to
WORK with relational databases. It's a common mistake to think that
relational models are tightly coupled with SQL or that SQL is the
foundation of relational models. It's not. Relational models are
abstract definitions. SQL is often used to define the implementations of
these models in relational databases, but it could well be possible that
you use a different language for that. The RDBMS clearly doesn't care
about that, as it works with relational algebra underneath.

FB


Gareth Erskine-Jones · Dec 6, 2008

It's a theoretical difference. The people who countered my point all
come up with a side effect of how some databases are implemented. That's
not the point. The point is what SQL is all about and what it doesn't
do. For example SELECT * FROM Table has by definition no ordering, it's
a set. To CONSUME the set, one could traverse it sequentially with a
cursor. However that's a technical implementation detail of a client API
utilizing the SQL engine. For example, if I do:
UPDATE Table
SET F1=F1+2
WHERE ID IN
(
    SELECT ID FROM Table2
    WHERE SomeField=@value
)

In which order are the rows in table updated? Undefined, because the
order of the select set is undefined and update works on a set, not a
sequence.

I do understand the difference between procedural code which would
"care" about the order of the data, and more declarative languages
(like SQL) which do not (at least, so far as the (a) theory and (b)
what is exposed to the user, is concerned).

This is a fundamental element of understanding what SQL is and how to
use it.

I think it's a fundamental mistake to confuse SQL with the relational
model. SQL tables are *not* sets (or relations) - not least because
sets / relations shouldn't be able to hold duplicate rows while SQL
allows this quite happily.
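
The distinction can be sketched in C# terms, with a List standing in for a SQL table without a key and a HashSet standing in for a relation. This is only an analogy under made-up data, not a model of any real database:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A "table" with no key: the same row can appear twice.
var rows = new List<(string Name, string Country)>
{
    ("Anna", "Germany"),
    ("Anna", "Germany"),
    ("Bob", "France"),
};

// A set cannot hold the duplicate; value tuples compare by content.
var relation = new HashSet<(string, string)>(rows);

Console.WriteLine(rows.Count);              // 3
Console.WriteLine(relation.Count);          // 2
Console.WriteLine(rows.Distinct().Count()); // 2
```

Distinct here plays roughly the role that SELECT DISTINCT plays in SQL: it turns the bag back into something set-like.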

Gareth Erskine-Jones · Dec 6, 2008

?

It's up to you, but I'd have preferred more than "?". Syntax is the
set of rules governing which symbols (including keywords) can be
placed where in a valid sentence. It's quite possible for two
languages to be utterly different in other ways while sharing a large
amount of syntax.

If you call that semantics, you clearly have no clue.

If this is going to descend into rudeness, then I'll drop out - I try
not to be unpleasant when I post, and I don't enjoy discussions where
insults are thrown out rather than actual constructive criticism. Yes,
you could say, "well usenet is like that", but it's not all like that
- I've used it for years, and have generally managed to get great
benefit from it whilst avoiding getting involved in exchanges like
that.

You say SQL is set oriented and Linq is sequence oriented. If you are
speaking of SQL as a language and Linq as a language, then the fact
that the languages differ in meaning even when they superficially look
the same is a semantic difference. If you have a different definition
of semantic I'd be interested to hear it.

Date hates everything he didn't cook up himself, so take his opinion
with a grain of salt.

I take everything anyone says with a grain of salt :-)


Date's views on NULLs are rather extreme in my view, but he's
perfectly correct that there are many aspects of SQL which mean that
SQL tables / views etc. are not in fact relations in the sense of the
relational algebra.

That said, I doubt it that the 'guru's' will say

The semantics of that are clear - the syntax is a little off though
:-)

that SQL is bad with respect to relational models: SQL is a language to
WORK with relational databases. It's a common mistake to think that
relational models are tightly coupled with SQL or that SQL is the
foundation of relational models. It's not. Relational models are
abstract definitions. SQL is often used to define the implementations of
these models in relational databases, but it could well be possible that
you use a different language for that.

Indeed, and there are such languages (mostly of academic interest),
which, when used to define relational databases, would result in
databases which were firmly based on the relational model - rather
than being rather loosely based on it, as happens with SQL.

Frans Bouma [C# MVP] · Dec 7, 2008

Gareth said:
I think it's a fundamental mistake to confuse SQL with the relational
model. SQL tables are *not* sets (or relations) - not least because
sets / relations shouldn't be able to hold duplicate rows while SQL
allows this quite happily.

erm... a 'relation' (Codd) is a set, and although the rows are
duplicate, it doesn't mean they're copies of the same instance.

A resultset from a SQL select statement is a new relation, and forms a set
of instances of an entity definition. If you have duplicates in the set,
it means the entity definition doesn't contain a unique identifying
attribute, otherwise duplicates wouldn't occur.

That SQL can form new sets which can be used as relations in other
set-based operations is one of the key differences between OO DBs and
relational databases, and it's why relational databases can often be a
better fit for the consumer of the set.

But I get the feeling we're bickering on tiny details of definitions
which are apparently not the same

FB


Frans Bouma [C# MVP] · Dec 7, 2008

Gareth said:
It's up to you, but I'd have preferred more than "?". Syntax is the
set of rules governing which symbols (including keywords) can be
placed where in a valid sentence. It's quite possible for two
languages to be utterly different in other ways while sharing a large
amount of syntax.

I didn't understand what you meant with the remark, hence my '?'. I
know what 'syntax' means, though I also think I know what people mean
with 'sql like syntax', i.e.: not only do the words look the same, the
meaning is also roughly the same. Which I made a point about.

If this is going to descend into rudeness, then I'll drop out - I try
not to be unpleasant when I post, and I don't enjoy discussions where
insults are thrown out rather than actual constructive criticism. Yes,
you could say, "well usenet is like that", but it's not all like that
- I've used it for years, and have generally managed to get great
benefit from it whilst avoiding getting involved in exchanges like
that.

sorry but I find debates where people throw in 'but that's semantics' a
total waste of my time.

You say SQL is set oriented and Linq is sequence oriented. If you are
speaking of SQL as a language and Linq as a language, then the fact
that the languages differ in meaning even when they superficially look
the same is a semantic difference. If you have a different definition
of semantic I'd be interested to hear it.

If someone says it's just semantics, to me it means that there are just
tiny differences and it's up to semantics (I'm not a native English
speaker), which in this case is VERY WRONG.

Perhaps I should stop warning people that LINQ queries aren't SQL
queries and shouldn't be seen as such, and just let them plow on,
wondering why their code is so dog slow in production.

The core point of many problems with linq is that people think they
understand linq queries because they understand SQL. That's a mistake.
Linq doesn't work the same way. I try to explain why this is by
explaining that SQL is set oriented and linq is sequence oriented, but
it's getting pretty painful, to be honest.

FB


Gareth Erskine-Jones · Dec 7, 2008

I didn't understand what you meant with the remark, hence my '?'. I
know what 'syntax' means, though I also think I know what people mean
with 'sql like syntax', i.e.: not only do the words look the same, the
meaning is also roughly the same. Which I made a point about.

sorry but I find debates where people throw in 'but that's semantics' a
total waste of my time.

If someone says it's just semantics, to me it means that there are just
tiny differences and it's up to semantics (I'm not a native English
speaker), which in this case is VERY WRONG.

With respect, I think you've misunderstood an English idiom. People
often say "that's just semantics" meaning - that the differences are
trivial. I however was using the word in its real sense - the grammar
of a language consists of both syntax and symantecs. Syntax is the set
of rules that say how sentences can be constructed to be valid
sentences. Symantics are the rules that relate the sentences to
reality - the "meaning" of a sentence.
I didn't say "it's just semantics"

Perhaps I should stop warning people that LINQ queries aren't SQL
queries and shouldn't be seen as such, and just let them plow on,
wondering why their code is so dog slow in production.

The core point of many problems with linq is that people think they
understand linq queries because they understand SQL. That's a mistake.
Linq doesn't work the same way. I try to explain why this is by
explaining that SQL is set oriented and linq is sequence oriented, but
it's getting pretty painful, to be honest.

I think I agree- I just had a problem with that particular example I
guess.

Jon Skeet [C# MVP] · Dec 7, 2008

With respect, I think you've misunderstood an English idiom. People
often say "that's just semantics" meaning - that the differences are
trivial. I however was using the word in its real sense - the grammar
of a language consists of both syntax and symantecs. Syntax is the set
of rules that say how sentences can be constructed to be valid
sentences. Symantics are the rules that relate the sentences to
reality - the "meaning" of a sentence.
I didn't say "it's just semantics"

That was exactly my point as well (although I'm intrigued as to why
you've placed a "y" as the second letter several times - it looks
deliberate rather than a typo, and I'd be interested to know more).

It would be quite possible to write a language with the same
*semantics* as SQL but with completely different *syntax*.

The interesting thing with LINQ is that we have the opportunity to do
the equivalent within C#. The *semantics* of

from x in source where x.Name == "fred" select x.Age

are *exactly* the same as

source.Where(x => x.Name == "fred").Select(x => x.Age)

but the *syntax* is clearly different.
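
A runnable sketch of that equivalence, assuming an in-memory source with made-up Name/Age data. The compiler translates the query expression into the same Where and Select calls, so both forms produce the same sequence:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory source; any IEnumerable<T> with Name and Age would do.
var source = new[]
{
    new { Name = "fred", Age = 30 },
    new { Name = "wilma", Age = 28 },
    new { Name = "fred", Age = 41 },
};

// Query expression syntax.
var queryAges = (from x in source where x.Name == "fred" select x.Age).ToArray();

// Method (extension call) syntax with the same predicate and projection.
var methodAges = source.Where(x => x.Name == "fred").Select(x => x.Age).ToArray();

Console.WriteLine(queryAges.SequenceEqual(methodAges)); // True
```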

I applaud your stand to distinguish between semantics and syntax

(I share Frans Bouma's frustration with those who *do* say "that's just
semantics" when it's the semantics which are actually being discussed,
mind you. Playing the "semantics" card when discussing the difference
between passing a reference by value and passing an object by reference
is awful. Anyway, I digress and it's late...)

Frans Bouma [C# MVP] · Dec 8, 2008

Gareth said:
With respect, I think you've misunderstood an English idiom. People
often say "that's just semantics" meaning - that the differences are
trivial.

That was indeed my interpretation of your sentence.

FB


Gareth Erskine-Jones · Dec 8, 2008

That was exactly my point as well (although I'm intrigued as to why
you've placed a "y" as the second letter several times - it looks
deliberate rather than a typo, and I'd be interested to know more).

Heh - only just noticed that. No, I did just mean semantics. It is
just a (repeated) typo.

It would be quite possible to write a language with the same
*semantics* as SQL but with completely different *syntax*.

The interesting thing with LINQ is that we have the opportunity to do
the equivalent within C#. The *semantics* of

from x in source where x.Name == "fred" select x.Age

are *exactly* the same as

source.Where(x => x.Name == "fred").Select(x => x.Age)

but the *syntax* is clearly different.

I applaud your stand to distinguish between semantics and syntax

(I share Frans Bouma's frustration with those who *do* say "that's just
semantics" when it's the semantics which are actually being discussed,
mind you.

Yes, it's a pretty sloppy use of English. I don't know where you're
based, but in the UK sports presenters (whose output I usually ignore)
will often say of some detail of a sports match, "but that's purely
academic" - meaning, I presume, "that's purely of academic interest" -
which I guess boils down to "that doesn't really matter".
