sliced find in Linq

A

Andrus

I have big table.
How to return next n entities staring from mth entity from this table ie.
to generate select statement

SELECT * FROM customer OFFSET :m LIMIT :n

I tried

var q = from c in db.Customers select k offset 5 limit 6;

in vcse beta 2

but got error.

How to force LINQ to perform such query ?

Andrus.
 
N

Nicholas Paldino [.NET/C# MVP]

Andrus,

You have to remember that LINQ is not SQL. However, what you want to do
is very easy. First, you start off with your base query:

// The base query.
var q = from c in db.Customers;

Of course, this is the same as the Customers collection, so you can just
do that. From there, you want to use the Skip extension method, passing in
m. Then, you want to use the Take extension method, passing in n, like so:

var q = db.Customers.Skip(m).Take(n);

From there, you can enumerate q.

I notice that you are using the Customers property off a variable named
db. You need to be aware that (AFAIK) this is not going to get translated
into SQL to get the information from the underlying data source for the
Customers instances. It will load all the information from the database,
then skip over the first m elements, and then take the next n.
 
C

Carl Daniel [VC++ MVP]

Nicholas said:
Andrus,

You have to remember that LINQ is not SQL. However, what you want
to do is very easy. First, you start off with your base query:

// The base query.
var q = from c in db.Customers;

Of course, this is the same as the Customers collection, so you
can just do that. From there, you want to use the Skip extension
method, passing in m. Then, you want to use the Take extension
method, passing in n, like so:
var q = db.Customers.Skip(m).Take(n);

....but be aware that in all likelihood, this will return the entire query
result into memory, then discard the first m rows, and finally retun the
next n rows. The select ... offset... limit is Oracle-specific SQL syntax,
so I wouldn't assume that Linq to SQL knows about it.

-cd
 
J

Jon Skeet [C# MVP]

Carl Daniel [VC++ MVP]
...but be aware that in all likelihood, this will return the entire query
result into memory, then discard the first m rows, and finally retun the
next n rows. The select ... offset... limit is Oracle-specific SQL syntax,
so I wouldn't assume that Linq to SQL knows about it.

Fortunately it's easy to check this, and in LINQ to SQL against SQL
Server 2005 at least, it works well. I tried this:

var q = db.Customers.Skip(3).Take(5);

foreach (Customer cust in q)
{
Console.WriteLine (cust.CompanyName);
}

and the SQL executed was:

SELECT [t1].[CustomerID], [t1].[CompanyName], [t1].[ContactName], [t1].
[ContactTitle], [t1].[Address], [t1].[City], [t1].[Region], [t1].
[PostalCode], [t1].[Country], [t1].[Phone], [t1].[Fax]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address],
[t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].
[Phone], [t0].[Fax]) AS [ROW_NUMBER], [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle],
[t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].
[Country],[t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]


It's not exactly neat, but the query plan looks reasonable, and it
certainly avoids returning everything to the client.

(Note that even if it *did* do everything in the client, it wouldn't
have to load all the results into memory before starting to skip
things.)
 
M

Marc Gravell

A minor aside; IMO, this discussion highlights (to me) one of the key
benefits of switching to LINQ; as new server features become
available, it becomes the job of LINQ to worry about using it. For
instance, we currently have a combination of SQL 2000 and 2005 (with a
plan to move all to 2008 when it is RTM). Now, if I use the above
query on my 2000 boxes, it will probably use some combination of TOP
etc. However, when I upgrade to 2008 I fully expect it to rearrange
things without me having to worry about it - which (when you have a
large system) is a very big thing.

Can't wait for the RTM ;-p
 
F

Frans Bouma [C# MVP]

Carl said:
...but be aware that in all likelihood, this will return the entire
query result into memory, then discard the first m rows, and finally
retun the next n rows. The select ... offset... limit is
Oracle-specific SQL syntax, so I wouldn't assume that Linq to SQL
knows about it.

IMHO, oracle doesn't have OFFSET and LIMIT operators for SELECT, if
you want to perform a paging query in oracle, you have to refer to the
row number virtual column present in every resultset. :)

FB


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
F

Frans Bouma [C# MVP]

Marc said:
A minor aside; IMO, this discussion highlights (to me) one of the key
benefits of switching to LINQ; as new server features become
available, it becomes the job of LINQ to worry about using it. For
instance, we currently have a combination of SQL 2000 and 2005 (with a
plan to move all to 2008 when it is RTM). Now, if I use the above
query on my 2000 boxes, it will probably use some combination of TOP
etc. However, when I upgrade to 2008 I fully expect it to rearrange
things without me having to worry about it - which (when you have a
large system) is a very big thing.

Every decent O/R mapper offers pagination over resultsets. For
SQLSERVER 2000, it will likely be a temptable approach, and although
the query posted earlier in the thread looks decent, a CTE would have
been more efficient IMHO.

(and you can create a generic paging query on sqlserver 2005 with a
CTE which always works without a lot of sql concatination. See:
http://weblogs.asp.net/fbouma/archive/2007/06/05/sqlserver-2005-paging-t
here-is-a-generic-wrapper-query-possible.aspx

FB

ps: 'skip' and 'take' aren't ideal. It would have been better if they
would have implemented a .TakePage(pagenumber, pagesize) extension
method instead, because paging is done in pages of every time the same
size (you're fetching page 1, 2, 3, etc. which all have the same size).
With skip/take you can have a skip number which isn't a product of
pagesize (the take parameter) * the page number to fetch -1


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
M

Marc Gravell

Every decent O/R mapper offers pagination over resultsets.
You are of course, right. I was comparing to the DIY approach, which
up until now has often served me well, to give maximum flexibilty (for
optimisation etc). But the downside is that you have more plumbing to
maintain.
I have looked at various ORM tools in the past (NHibernate, etc), and
LINQ is the first one I have seen that truly does what I want (i.e.
pretty much what I would have coded in SQL, with highly specific and
efficient queries), without me having to fight the ORM tool. I'm a
convert, and I suspect a great many other people will be too...
It would have been better if they would have implemented a
.TakePage(pagenumber, pagesize) extension method instead

But that is the great thing about extension methods and the composable
nature of LINQ expressions... roll your own ;-p

public static IQueryable<T> TakePage<T>(this IQueryable<T>
query,
int pageIndex, int pageSize) {
return query.Skip(pageSize * pageIndex).Take(pageSize);
}

Then you can just combine as you like:
var query = (from cust in db.Customers
select cust).TakePage(3, 10);

Marc
 
C

Carl Daniel [VC++ MVP]

Jon said:
Carl Daniel [VC++ MVP]
...but be aware that in all likelihood, this will return the entire
query result into memory, then discard the first m rows, and finally
retun the next n rows. The select ... offset... limit is
Oracle-specific SQL syntax, so I wouldn't assume that Linq to SQL
knows about it.

Fortunately it's easy to check this, and in LINQ to SQL against SQL
Server 2005 at least, it works well. I tried this:

var q = db.Customers.Skip(3).Take(5);

foreach (Customer cust in q)
{
Console.WriteLine (cust.CompanyName);
}

and the SQL executed was:

SELECT [t1].[CustomerID], [t1].[CompanyName], [t1].[ContactName],
[t1]. [ContactTitle], [t1].[Address], [t1].[City], [t1].[Region],
[t1]. [PostalCode], [t1].[Country], [t1].[Phone], [t1].[Fax]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle],
[t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode],
[t0].[Country], [t0]. [Phone], [t0].[Fax]) AS [ROW_NUMBER],
[t0].[CustomerID], [t0]. [CompanyName], [t0].[ContactName],
[t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region],
[t0].[PostalCode], [t0]. [Country],[t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]


It's not exactly neat, but the query plan looks reasonable, and it
certainly avoids returning everything to the client.

Cool - I'm impressed. Unfortunately I didn't have an Orcas installation to
test it on, so I'm glad to see that I was wrong about that.

The only undesirable thiing I see in the SQL above is that order by over all
of the returned columns. If Customers is a large table, this will result in
SQL server doing the sort in tempdb. Since the original query was
unordered, I'd like the above SQL to simply order by the primary key, since
that alone must define a unique ordering, which is all that's required for
this to work. (Perhaps your table doesn't have a primary key though?)

-cd
 
C

Carl Daniel [VC++ MVP]

Frans said:
IMHO, oracle doesn't have OFFSET and LIMIT operators for SELECT, if
you want to perform a paging query in oracle, you have to refer to the
row number virtual column present in every resultset. :)

Hmm... I know I've seen that extension somewhere, I guess it wasn't Oracle
though. :) MySQL maybe?

-cd
 
C

Carl Daniel [VC++ MVP]

Jon said:
Carl Daniel [VC++ MVP]
...but be aware that in all likelihood, this will return the entire
query result into memory, then discard the first m rows, and finally
retun the next n rows. The select ... offset... limit is
Oracle-specific SQL syntax, so I wouldn't assume that Linq to SQL
knows about it.

Fortunately it's easy to check this, and in LINQ to SQL against SQL
Server 2005 at least, it works well. I tried this:

var q = db.Customers.Skip(3).Take(5);

foreach (Customer cust in q)
{
Console.WriteLine (cust.CompanyName);
}

and the SQL executed was:

SELECT [t1].[CustomerID], [t1].[CompanyName], [t1].[ContactName],
[t1]. [ContactTitle], [t1].[Address], [t1].[City], [t1].[Region],
[t1]. [PostalCode], [t1].[Country], [t1].[Phone], [t1].[Fax]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle],
[t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode],
[t0].[Country], [t0]. [Phone], [t0].[Fax]) AS [ROW_NUMBER],
[t0].[CustomerID], [t0]. [CompanyName], [t0].[ContactName],
[t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region],
[t0].[PostalCode], [t0]. [Country],[t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]


It's not exactly neat, but the query plan looks reasonable, and it
certainly avoids returning everything to the client.

Cool - I'm impressed. Unfortunately I didn't have an Orcas installation to
test it on, so I'm glad to see that I was wrong about that.

The only undesirable thiing I see in the SQL above is that order by over all
of the returned columns. If Customers is a large table, this will result in
SQL server doing the sort in tempdb. Since the original query was
unordered, I'd like the above SQL to simply order by the primary key, since
that alone must define a unique ordering, which is all that's required for
this to work. (Perhaps your table doesn't have a primary key though?)

-cd
 
J

Jon Skeet [C# MVP]

ps: 'skip' and 'take' aren't ideal. It would have been better if they
would have implemented a .TakePage(pagenumber, pagesize) extension
method instead, because paging is done in pages of every time the same
size (you're fetching page 1, 2, 3, etc. which all have the same size).
With skip/take you can have a skip number which isn't a product of
pagesize (the take parameter) * the page number to fetch -1

To me that's the benefit of having Skip and Take separately. As Marc
said, you can build TakePage out of Skip and Take easily, but you
*can't* build Skip and Take out of just TakePage.

Suppose you want to display pages with the top 3 entries of one page
being the bottom 3 of the previous page. At that point you get:

Page 1: 1-10
Page 2: 8-17
Page 3: 15-24

etc

In that case you exactly *don't* want the skip number to be a product
of the pagesize.
 
W

Willy Denoyette [MVP]

Jon Skeet said:
Carl Daniel [VC++ MVP]
...but be aware that in all likelihood, this will return the entire query
result into memory, then discard the first m rows, and finally retun the
next n rows. The select ... offset... limit is Oracle-specific SQL
syntax,
so I wouldn't assume that Linq to SQL knows about it.

Fortunately it's easy to check this, and in LINQ to SQL against SQL
Server 2005 at least, it works well. I tried this:

var q = db.Customers.Skip(3).Take(5);

foreach (Customer cust in q)
{
Console.WriteLine (cust.CompanyName);
}

and the SQL executed was:

SELECT [t1].[CustomerID], [t1].[CompanyName], [t1].[ContactName], [t1].
[ContactTitle], [t1].[Address], [t1].[City], [t1].[Region], [t1].
[PostalCode], [t1].[Country], [t1].[Phone], [t1].[Fax]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle], [t0].[Address],
[t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country], [t0].
[Phone], [t0].[Fax]) AS [ROW_NUMBER], [t0].[CustomerID], [t0].
[CompanyName], [t0].[ContactName], [t0].[ContactTitle],
[t0].[Address], [t0].[City], [t0].[Region], [t0].[PostalCode], [t0].
[Country],[t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]


It's not exactly neat, but the query plan looks reasonable, and it
certainly avoids returning everything to the client.

(Note that even if it *did* do everything in the client, it wouldn't
have to load all the results into memory before starting to skip
things.)



On my box, the "syntesized" select stmt looks looks like this :

SELECT TOP 5 [t1].[CustomerID], [t1].[TerritoryID]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID],
[t0].[TerritoryID]) AS [ROW_NUMBER], [t0].[CustomerID], [t0].[TerritoryID]
FROM [Sales].[Customer] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] > @p0',N'@p0 int',@p0=3

notice the SELECT TOP 5 expression and the (obviously) different WHERE
clause.

What version of the framework are you running against?


Willy.
 
J

Jon Skeet [C# MVP]

On my box, the "syntesized" select stmt looks looks like this :

SELECT TOP 5 [t1].[CustomerID], [t1].[TerritoryID]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID],
[t0].[TerritoryID]) AS [ROW_NUMBER], [t0].[CustomerID], [t0].[TerritoryID]
FROM [Sales].[Customer] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] > @p0',N'@p0 int',@p0=3

notice the SELECT TOP 5 expression and the (obviously) different WHERE
clause.
What version of the framework are you running against?

Which SQL Server version are you running against?

I suspect it's more likely to be a SQL Server change than a framework
change.

My select statement was created against SQL Server Express 2005.
 
A

Andrus

IMHO, oracle doesn't have OFFSET and LIMIT operators for SELECT, if
you want to perform a paging query in oracle, you have to refer to the
row number virtual column present in every resultset. :)

I used PostgreSQL database syntax.
NHibernate generates sql statement with LIMIT and OFFSET clauses when
SlicedFindAll() is used.

LinQ provider for PostgreSQL crashes on Skip() and/or Take() functions so it
is not possible to use them.


Andrus.
 
A

Andrus

To me that's the benefit of having Skip and Take separately. As Marc
said, you can build TakePage out of Skip and Take easily, but you
*can't* build Skip and Take out of just TakePage.

Suppose you want to display pages with the top 3 entries of one page
being the bottom 3 of the previous page. At that point you get:

Page 1: 1-10
Page 2: 8-17
Page 3: 15-24

Where to find Winforms sample which uses paging ?

I'm struggling with MSDN DataGridView virtual mode sample.
When I add need row in datagridview or delete row, this sample cache
provider returns index out of range error.
I think I need to adjust cache pages when row is added or deleted from grid.

Andrus.
 
W

Willy Denoyette [MVP]

Jon Skeet said:
On my box, the "syntesized" select stmt looks looks like this :

SELECT TOP 5 [t1].[CustomerID], [t1].[TerritoryID]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [t0].[CustomerID],
[t0].[TerritoryID]) AS [ROW_NUMBER], [t0].[CustomerID],
[t0].[TerritoryID]
FROM [Sales].[Customer] AS [t0]
) AS [t1]
WHERE [t1].[ROW_NUMBER] > @p0',N'@p0 int',@p0=3

notice the SELECT TOP 5 expression and the (obviously) different WHERE
clause.
What version of the framework are you running against?

Which SQL Server version are you running against?

I suspect it's more likely to be a SQL Server change than a framework
change.

My select statement was created against SQL Server Express 2005.



What makes you think the server version would make any difference?. The
T-SQL statement is produced at the client (by the Linq To SQL query
provider) and sent to the server. I ran this against SQL2005 Express and
SQL2005 Standard edition, both show the (obviously) same T-SQL statement.
Even using this against a SQL2000 instance works equally well.

Willy.
 
J

Jon Skeet [C# MVP]

Andrus said:
Where to find Winforms sample which uses paging ?

I don't know, I'm afraid.
I'm struggling with MSDN DataGridView virtual mode sample.
When I add need row in datagridview or delete row, this sample cache
provider returns index out of range error.
I think I need to adjust cache pages when row is added or deleted from grid.

Quite possibly. I'm afraid I'm relatively inexperienced with WinForms
examples, particularly when they're retrieving data directly from
databases - I normally go through another layer (e.g. a web service).
 
J

Jon Skeet [C# MVP]

Willy Denoyette said:
What makes you think the server version would make any difference?

Frans has mentioned before that LINQ to SQL will occasionally make
changes based on which type of database it's talking to (a *very*
limited portability layer!) and I thought that might be happening here.
The T-SQL statement is produced at the client (by the Linq To SQL query
provider) and sent to the server. I ran this against SQL2005 Express and
SQL2005 Standard edition, both show the (obviously) same T-SQL statement.
Ah.

Even using this against a SQL2000 instance works equally well.

That's good to know.
 
W

Willy Denoyette [MVP]

Jon Skeet said:
Frans has mentioned before that LINQ to SQL will occasionally make
changes based on which type of database it's talking to (a *very*
limited portability layer!) and I thought that might be happening here.

But the client doesn't even know what DB he's talking against (well he
assumes it's SQL Server) even before he even sends the first T-SQ, granted
the SQL server may transform the T-SQL statement internally, but what I'm
after is the SQL produced by the provider.
Following is a complete sample (using the ADVWorks sample DB), that
illustrates my point. Here I'm displaying the T-SQL to the console, but I
don't even attach to a DB (sure I'm authenticating against the server
instance).

using System;
using System.Data.Linq;
using System.Linq;
using System.Data.Linq.Mapping;

[Table(Name="Sales.Customer")]
class Customer
{
[Column(IsPrimaryKey=true)]
public int CustomerID;
[Column]
public int TerritoryID;
}
class NWCtxt : DataContext
{
public NWCtxt (string conStr) : base(conStr)
{}
public Table<Customer> Customers;
}

class Program
{
static void Main()
{
NWCtxt ctx = new NWCtxt
("server=.\\SQLEXPRESS;database=AdventureWorks;integrated security=sspi");
var query = ctx.Customers.Skip(3).Take(5);
Console.WriteLine(ctx.GetCommand(query).CommandText);
}
}



That's good to know.

Well, actually SQL2000 isn't officiallly supoorted (which means not tested),
but I would be surprised if there were more issues with Linq-SQL than when
compared to SQL2005.

Willy.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top