Why is Linq slower?

Paul · Jan 31, 2008

The reason FindAll was faster probably was because it uses a for-loop instead
of a foreach-loop (I guess its because the foreach requires that extra call
for GetEnumerator() -method). So if I switch to a for-loop in my original
foreach-based code, I get around 1 800 000 ticks. So the good old reliable
for-loop might be the choice for those hardcode fps-game-programmers (for the
ones who dare to come out of the C++ closet that is). For the rest of us, I
guess it couldn't matter less

Paul · Jan 31, 2008

That's pretty tricky without having complete source code. Give us a

proper test harness and we can experiment appropriately.

Sure, as you can see i have isolated the important parts.
Paste in a new Console-app:

class Program
{
private static List<ConferenceRoom> rooms = new List<ConferenceRoom>
{
new ConferenceRoom{SeatingCapacity = 3},
new ConferenceRoom{SeatingCapacity = 5},
new ConferenceRoom{SeatingCapacity = 7},
new ConferenceRoom{SeatingCapacity = 9},
new ConferenceRoom{SeatingCapacity = 12}
};

static void Main(string[] args)
{
DateTime start = DateTime.Now;

for(int i = 0; i < 1000000; i++)
ForWithIf(8);

Console.WriteLine((DateTime.Now - start).Ticks);

Console.Read();
}

//Ca: 2 500 000 ticks
public static List<ConferenceRoom> ForeachTest(int minimumSeatingCapacity)
{
List<ConferenceRoom> availableRooms = new List<ConferenceRoom>();

foreach (ConferenceRoom room in rooms)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
availableRooms.Add(room);
}

return availableRooms;
}

//Ca: 4 500 000 ticks
public static List<ConferenceRoom> LinqTest(int minimumSeatingCapacity)
{
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

//Ca: 2 400 000 ticks
public static List<ConferenceRoom> FindAllTest(int minimumSeatingCapacity)
{
return rooms.FindAll(delegate(ConferenceRoom room)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
return true;
return false;
});
}

//Ca: 1 800 000 ticks
public static List<ConferenceRoom> ForTest(int minimumSeatingCapacity)
{
List<ConferenceRoom> list = new List<ConferenceRoom>();
for (int i = 0; i < rooms.Count; i++)
{
if (rooms.SeatingCapacity >= minimumSeatingCapacity)
list.Add(rooms);
}
return list;
}
}

public class ConferenceRoom
{
private int seatingCapacity;

public int SeatingCapacity
{
get { return seatingCapacity; }
set { seatingCapacity = value; }
}
}

Jon Skeet [C# MVP] · Jan 31, 2008

The reason FindAll was faster probably was because it uses a for-loop instead
of a foreach-loop (I guess its because the foreach requires that extra call
for GetEnumerator() -method). So if I switch to a for-loop in my original
foreach-based code, I get around 1 800 000 ticks. So the good old reliable
for-loop might be the choice for those hardcode fps-game-programmers (for the
ones who dare to come out of the C++ closet that is). For the rest of us, I
guess it couldn't matter less

Alternatively, use foreach but don't call ToList() at the end of the
LINQ expression. That avoids making a copy to a new list at all, which
may well save rather a lot of time. Only a benchmark would tell, of
course

Jon

Marc Gravell · Jan 31, 2008

By my measurement, most of the LINQ time is in the ToList()... I
changed the return (for all) to be IEnumerable<ConferenceRoom>, and
ran with that:

ForeachTest: 1,376,752
LinqTest: 0,856,144
LinqListTest: 4,174,352
FindAllTest: 1,632,382
ForTest: 0,844,985

(where:

Paul said:
That's pretty tricky without having complete source code. Give us a
proper test harness and we can experiment appropriately.

Click to expand...

Sure, as you can see i have isolated the important parts.
Paste in a new Console-app:

class Program
{
private static List<ConferenceRoom> rooms = new List<ConferenceRoom>
{
new ConferenceRoom{SeatingCapacity = 3},
new ConferenceRoom{SeatingCapacity = 5},
new ConferenceRoom{SeatingCapacity = 7},
new ConferenceRoom{SeatingCapacity = 9},
new ConferenceRoom{SeatingCapacity = 12}
};

static void Main(string[] args)
{
DateTime start = DateTime.Now;

for(int i = 0; i < 1000000; i++)
ForWithIf(8);

Console.WriteLine((DateTime.Now - start).Ticks);

Console.Read();
}

//Ca: 2 500 000 ticks
public static List<ConferenceRoom> ForeachTest(int
minimumSeatingCapacity)
{
List<ConferenceRoom> availableRooms = new List<ConferenceRoom>();

foreach (ConferenceRoom room in rooms)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
availableRooms.Add(room);
}

return availableRooms;
}

//Ca: 4 500 000 ticks
public static List<ConferenceRoom> LinqTest(int
minimumSeatingCapacity)
{
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

//Ca: 2 400 000 ticks
public static List<ConferenceRoom> FindAllTest(int
minimumSeatingCapacity)
{
return rooms.FindAll(delegate(ConferenceRoom room)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
return true;
return false;
});
}

//Ca: 1 800 000 ticks
public static List<ConferenceRoom> ForTest(int
minimumSeatingCapacity)
{
List<ConferenceRoom> list = new List<ConferenceRoom>();
for (int i = 0; i < rooms.Count; i++)
{
if (rooms.SeatingCapacity >= minimumSeatingCapacity)
list.Add(rooms);
}
return list;
}
}

public class ConferenceRoom
{
private int seatingCapacity;

public int SeatingCapacity
{
get { return seatingCapacity; }
set { seatingCapacity = value; }
}
}

Marc Gravell · Jan 31, 2008

darn... wrong key:

where:

public static IEnumerable<ConferenceRoom> LinqTest(int
minimumSeatingCapacity) {
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room);
}

public static IEnumerable<ConferenceRoom> LinqListTest(int
minimumSeatingCapacity) {
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

[Jon]>Only a benchmark would tell, of course

Get out of my mind! ;-p

Marc

Marc Gravell · Jan 31, 2008

(oops - this is actually a continuation of another branch in this
topic - but you get the idea...)

Jon Skeet [C# MVP] · Jan 31, 2008

(oops - this is actually a continuation of another branch in this
topic - but you get the idea...)

You want to bring continuations into the equation as well? Yikes!

Jon

Paul · Jan 31, 2008

But without the ToList() the methods are not functional equal anymore. It
becomes unfair to compare them since the other methods have to go through and
test all 5 rooms 1 million times where the deferred execution one (LinkTest)
doesn't (until we eventually start iterating). Or am I missing something?

Marc Gravell said:
By my measurement, most of the LINQ time is in the ToList()... I
changed the return (for all) to be IEnumerable<ConferenceRoom>, and
ran with that:

ForeachTest: 1,376,752
LinqTest: 0,856,144
LinqListTest: 4,174,352
FindAllTest: 1,632,382
ForTest: 0,844,985

(where:

Paul said:

That's pretty tricky without having complete source code. Give us a
proper test harness and we can experiment appropriately.

Click to expand...

Sure, as you can see i have isolated the important parts.
Paste in a new Console-app:

class Program
{
private static List<ConferenceRoom> rooms = new List<ConferenceRoom>
{
new ConferenceRoom{SeatingCapacity = 3},
new ConferenceRoom{SeatingCapacity = 5},
new ConferenceRoom{SeatingCapacity = 7},
new ConferenceRoom{SeatingCapacity = 9},
new ConferenceRoom{SeatingCapacity = 12}
};

static void Main(string[] args)
{
DateTime start = DateTime.Now;

for(int i = 0; i < 1000000; i++)
ForWithIf(8);

Console.WriteLine((DateTime.Now - start).Ticks);

Console.Read();
}

//Ca: 2 500 000 ticks
public static List<ConferenceRoom> ForeachTest(int
minimumSeatingCapacity)
{
List<ConferenceRoom> availableRooms = new List<ConferenceRoom>();

foreach (ConferenceRoom room in rooms)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
availableRooms.Add(room);
}

return availableRooms;
}

//Ca: 4 500 000 ticks
public static List<ConferenceRoom> LinqTest(int
minimumSeatingCapacity)
{
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

//Ca: 2 400 000 ticks
public static List<ConferenceRoom> FindAllTest(int
minimumSeatingCapacity)
{
return rooms.FindAll(delegate(ConferenceRoom room)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
return true;
return false;
});
}

//Ca: 1 800 000 ticks
public static List<ConferenceRoom> ForTest(int
minimumSeatingCapacity)
{
List<ConferenceRoom> list = new List<ConferenceRoom>();
for (int i = 0; i < rooms.Count; i++)
{
if (rooms.SeatingCapacity >= minimumSeatingCapacity)
list.Add(rooms);
}
return list;
}
}

public class ConferenceRoom
{
private int seatingCapacity;

public int SeatingCapacity
{
get { return seatingCapacity; }
set { seatingCapacity = value; }
}
}

Click to expand...

Jon Skeet [C# MVP] · Jan 31, 2008

But without the ToList() the methods are not functional equal anymore. It
becomes unfair to compare them since the other methods have to go through and
test all 5 rooms 1 million times where the deferred execution one (LinkTest)
doesn't (until we eventually start iterating). Or am I missing something?

The important thing is to work out whether you really *need* a list.
Often you won't - you just need to iterate.

So yes, the benchmark for each should iterate over the filtered
results each time.

We should also test with a larger list - say 1000 elements, and
100,000 elements (with fewer iterations, of course), making sure that
a reasonable proportion (e.g. half) match the filter each time.

Jon

Marc Gravell · Jan 31, 2008

D'oh! Yes you are right.... deferred exectuion; the LINQ (without
ToList()) hasn't done anything!

OK, I've returned Count, using property on lists, and
Enumerable.Count() on the LINQ (w/o list), and added a checksum:

ForeachTest: 1,839,284 2000000
LinqTest: 3,210,923 2000000
LinqListTest: 3,919,258 2000000
FindAllTest: 1,342,625 2000000
ForTest: 0,850,867 2000000

So yes, it looks slower (approx factor of 2)... but even so, you are
talking 1s for 1M iterations, and it appears to scale linearly...

Not a great answer, I'll admit...

Marc

Marc Gravell · Jan 31, 2008

Often you won't - you just need to iterate.
But it helps if you *do* iterate... my mistake ;-/

I also ran some tests for Enumerable.Count vs doing the test and
incrementing an accumulator directly (foreach and indexer); the
results are again roughly a factor of 2 out, /probably/ largely due to
the overhead of calling the delegate.

Marc

Marc Gravell · Jan 31, 2008

OK; I did some more tests... the problem appears to not be LINQ as
much as virtual calls...

Your test rig works directly off the List<T> variable, so (since this
is sealed) it can jump the virtcall and go direct to the method. When
all you have is the interface (IList<T>, IEnumerable<T>, etc), then it
cannot do this.

I constructed another test rig just doing a filtered count (no LINQ)
using 10000 iterations over 10000 objects, with all permutations of:
* interface vs concrete List<T>
* foreach vs indexer
* inline test vs predicate

I haven't included the full rig code (quite long) - but I will keep it
for a little while if anybody wants to verify / etc.
Results are console, using stopwatch, with forced GC between runs,
release without debugger (last column is a checksum to prove all
equal):

ConcreteForeachInline 1439ms [39040000]
InterfaceForeachInline 2479ms [39040000]
ConcreteIndexerInline 678ms [39040000]
InterfaceIndexerInline 675ms [39040000]
ConcreteForeachPredicate 2212ms [39040000]
InterfaceForeachPredicate 3239ms [39040000]
ConcreteIndexerPredicate 1445ms [39040000]
InterfaceIndexerPredicate 1453ms [39040000]

This pretty much ties in with what we were seeing for LINQ, in
particular looking at the Foreach results (since LINQ-to-objects
primarily uses IEnumerable<T>), both for inline and predicate usage.

In reality, you are unlikely to ever face a scenario where this is the
actual bottleneck, but I'm glad I understand it a bit better. For
example, if this was critical you could write a few List<T>-specific
extension methods for Where etc, and still use LINQ without a single
change to your code (except perhaps an extra "using" statement). This
would work because of how the compiler works - but the problem would
be what to return from Where etc - if you return IEnumerable<T> you
are back in interface-ville, but you don't necessarily want the memory
overhead of filling another List. But to demonstrate I went for the
latter option and knocked a few together:

Std LINQ Count 4234ms [39040000]
List LINQ Count 2264ms [39040000]

Std LINQ Where ToList 7797ms [39040000]
List LINQ Where ToList 3160ms [39040000]

Does that fill in a few blanks?

Marc

Marc Gravell · Jan 31, 2008

List LINQ Count 2264ms [39040000]

I should clarify that this was a predicate count; the Count() without
predicate is a lot quicker; the standard LINQ implementation has the
good-sense to test for ICollection<T>, so the unfiltered Count() works
blindingly either way; not worth re-implementing...

Marc

Marc Gravell · Jan 31, 2008

Actually, a caffeine low meant I fluffed the test slightly
(InterfaceIndexer* was using List<T>); I'll try and get my act
together!

The good news is that the revised code further demonstrates this
point, highlighting that the difference also applies with indexers:

ConcreteForeachInline 1916ms [39040000]
InterfaceForeachInline 3297ms [39040000]
ConcreteIndexerInline 888ms [39040000]
InterfaceIndexerInline 1655ms [39040000]
ConcreteForeachPredicate 3022ms [39040000]
InterfaceForeachPredicate 4355ms [39040000]
ConcreteIndexerPredicate 1908ms [39040000]
InterfaceIndexerPredicate 2692ms [39040000]

For the record, I would still recommend coding to interfaces; it makes
the code far more reusable and open to refactoring.

I've treble checked my signatures this time:

static int ConcreteForeachInline(List<Foo> foos, int cutoff) {...}
static int InterfaceForeachInline(IEnumerable<Foo> foos, int cutoff)
{...}
static int ConcreteIndexerInline(List<Foo> foos, int cutoff) {...}
static int InterfaceIndexerInline(IList<Foo> foos, int cutoff) {...}
static int ConcreteForeachPredicate(List<Foo> foos, int cutoff) {...}
static int InterfaceForeachPredicate(IEnumerable<Foo> foos, int
cutoff) {...}
static int ConcreteIndexerPredicate(List<Foo> foos, int cutoff) {...}
static int InterfaceIndexerPredicate(IList<Foo> foos, int cutoff)
{...}

Marc

Frans Bouma [C# MVP] · Feb 1, 2008

Jon said:
The important thing is to work out whether you really need a list.
Often you won't - you just need to iterate.

Though in practise, you will return the list. The thing is: do you
want to pass the query around? Remember, the query can contain
expensive resources like a database connection, and you might not want
to expose that to a given tier, or even can't because the receiver is
on another machine.

FB
--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Jon Skeet [C# MVP] · Feb 1, 2008

On Feb 1, 8:59 am, "Frans Bouma [C# MVP]"

Though in practise, you will return the list.

To what? It really depends on the precise nature of how this is being
used.

The thing is: do you
want to pass the query around? Remember, the query can contain
expensive resources like a database connection, and you might not want
to expose that to a given tier, or even can't because the receiver is
on another machine.

Well, in this case there's no indication that there's necessarily a
database ever involved, or a different tier. Yes, if you need to
return results to a different tier you're likely to need all the data,
although you may not need to copy it to an actual List<T> first.
Imagine building up some XML to return - you may well only need to
iterate through it to build up the XML - so creating a List<T> would
be a wasteful copy.

It's the classic performance answer, really: "it depends" (on just
about everything).

Jon

Why is Linq slower?

Paul

Paul

Jon Skeet [C# MVP]

Marc Gravell

Marc Gravell

Marc Gravell

Jon Skeet [C# MVP]

Paul

Jon Skeet [C# MVP]

Marc Gravell

Marc Gravell

Marc Gravell

Marc Gravell

Marc Gravell

Frans Bouma [C# MVP]

Jon Skeet [C# MVP]

Ask a Question

Similar Threads