Why is Linq slower?

P

paululvinius

Hi!

Testing som Linq-expressions and tried to measure performance and
compare it to pre-Linq programming.

The folloing two methods are functional equal but the non-Linq one is
twice as fast.

public List<ConferenceRoom> OldWay(int minimumSeatingCapacity)
{
List<ConferenceRoom> availableRooms = new List<ConferenceRoom>();

foreach (ConferenceRoom room in rooms)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
availableRooms.Add(room);
}

return availableRooms;
}

public List<ConferenceRoom> NewWay(int minimumSeatingCapacity)
{
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

PS.
Just did a plain and simple:

DateTime start = DateTime.Now;
for (int i = 0; i < 1000000; i++)
rooms.OldWay(8);

Console.WriteLine((DateTime.Now - start).Ticks);

....to mesaure the performance (Did repeate the test over and over).

Results:
Non-Linq: ca. 6 300 000 ticks
Linq: ca. 3 100 000

Why is Linq slower?

Thanks
/Paul
 
P

Paul

Sorry, just posted a reply saying that I mixed the results.

Its supposed to be vice-verca.

thanks
 
W

Willy Denoyette [MVP]

Hi!

Testing som Linq-expressions and tried to measure performance and
compare it to pre-Linq programming.

The folloing two methods are functional equal but the non-Linq one is
twice as fast.

public List<ConferenceRoom> OldWay(int minimumSeatingCapacity)
{
List<ConferenceRoom> availableRooms = new List<ConferenceRoom>();

foreach (ConferenceRoom room in rooms)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
availableRooms.Add(room);
}

return availableRooms;
}

public List<ConferenceRoom> NewWay(int minimumSeatingCapacity)
{
return (from room in rooms
where room.SeatingCapacity >= minimumSeatingCapacity
select room).ToList();
}

PS.
Just did a plain and simple:

DateTime start = DateTime.Now;
for (int i = 0; i < 1000000; i++)
rooms.OldWay(8);

Console.WriteLine((DateTime.Now - start).Ticks);

...to mesaure the performance (Did repeate the test over and over).

Results:
Non-Linq: ca. 6 300 000 ticks
Linq: ca. 3 100 000

Why is Linq slower?

Thanks
/Paul



Impossible to tell, unless you post the whole code.

Willy.
 
J

Jon Skeet [C# MVP]

Testing som Linq-expressions and tried to measure performance and
compare it to pre-Linq programming.

The folloing two methods are functional equal but the non-Linq one is
twice as fast.

It's probably due to the overhead in calling the delegate.

The question is whether it's actually significant. Is this a
performance bottleneck for you? Is the extra third of a second for a
million iterations going to hurt you?

Jon
 
P

Paul

You are probably right, the extra third of a second for a
million iterations will hardly be an issue. Just wanted to know if there is
a general overhead when using Linq.

My initial assumption was that it would be as fast (why wouldn't it), or
actually in some cases even faster (since Linq is a higher level programmatic
expression, it will allow the compiler to interpret what you really are
trying to do and then choose the most optimal way to do it (where we lazy
programmers sometimes can be careless)).

/Paul
 
J

Jon Skeet [C# MVP]

Paul said:
You are probably right, the extra third of a second for a
million iterations will hardly be an issue. Just wanted to know if there is
a general overhead when using Linq.

My initial assumption was that it would be as fast (why wouldn't it)

Because it's got an extra delegate call.
or actually in some cases even faster (since Linq is a higher level
programmatic expression, it will allow the compiler to interpret what
you really are trying to do and then choose the most optimal way to
do it (where we lazy programmers sometimes can be careless)).

I don't think you understand how LINQ works. The compiler first
translates this code:

(I've renamed minimumSeatingCapacity to minCapacity for the sake of
easy formatting)

return (from room in rooms
where room.SeatingCapacity >= minCapacity
select room).ToList();

into this code:

return rooms.Where(room => room.SeatingCapacity >= minCapacity)
.ToList();

Now, given the rest of your code, I strongly suspect "rooms" is a list
or something else which implements IEnumerable<T> but not
IQueryable<T>. That means the compiler will end up using the extension
methods from Enumerable, which take delegates - so the lambda
expression will be compiled into a delegate. That delegate is passed to
the Where method and executed for each element, as ToList() requests it
from the IEnumerable<T> returned by Where.

The compiler doesn't understand the meaning of any of this. It doesn't
have any idea what Where is meant to do, or ToList. It just follows the
rules in the language spec.

Basically, there's quite a lot more going on than in the more direct
version. Frankly I'm very impressed that it manages to work as fast as
it can!


Now, if we'd been dealing with expression trees, *that's* where
optimisation could have come into play - but at runtime, when the LINQ
provider in question might have created an optimised SQL query if it
were talking to a database, for instance.
 
N

Nicholas Paldino [.NET/C# MVP]

Paul,

In addition to the other comments, you have to remember that when you
perform the call to ToList, which will ultimately iterate through your
query, you are calling three implementations of the methods on
IEnumerable<T>.

The first is in the original implementation of IEnumerable<T> on rooms.

The second is in the call to the Where extension method, which returns
an IEnumerable<T> which calls the original enumerator and applies the filter
(it is applied in the MoveNext implementation of the IEnumerable<T>
returned).

The last is the call to the Select extension method, which performs the
transformation (if there is one), but still calls through to the IEnumerable
returned by the Where extension method.

The more you filter/transform/sort/any other operation, the larger the
composition of IEnumerable<T> implementations that you are making. It is
going to affect performance in some way.

And of course, as Jon mentioned, every separate were clause is going to
add another anonymous method, and there is some overhead in calling a
delegate as opposed to a direct method call.
 
J

Jon Skeet [C# MVP]

The last is the call to the Select extension method, which performs the
transformation (if there is one), but still calls through to the IEnumerable
returned by the Where extension method.

Slight point of pedantry (okay, extreme point of pedantry) - because
there's no transformation involved in this case, but there *is* another
clause (so it's not a degenerate query expression) Select doesn't get
called.

<snip>

It doesn't take away from your overall point though :)
And of course, as Jon mentioned, every separate were clause is going to
add another anonymous method, and there is some overhead in calling a
delegate as opposed to a direct method call.

Or indeed not making a call at all, with the code being specified
inline.
 
P

Paul

Interesting. Thank you guys for the info. Just started to dig into this,
after have just red some conceptual info.

Regards
Paul
 
N

Nicholas Paldino [.NET/C# MVP]

Jon,

Oh, look at that, the compiler is smart enough to know when ^not^ to
call the Select method. That's good to know. =)

Of course, it should be mentioned (because we are delving into pedantry
here) that any kind of transformation applied to the items being enumerated
over will add the call to Select, as well as ^another^ delegate call.
 
J

Jon Skeet [C# MVP]

Nicholas Paldino said:
Oh, look at that, the compiler is smart enough to know when ^not^ to
call the Select method. That's good to know. =)

Better than just the compiler - the language spec is smart enough.
That's a really good thing, as it means we get to predict what will
happen whatever compiler we use.

I wouldn't want that level of change to be a compiler-specific thing.
Of course, it should be mentioned (because we are delving into pedantry
here) that any kind of transformation applied to the items being enumerated
over will add the call to Select, as well as ^another^ delegate call.

Unless you do a "join ... into" or a "join" which is followed by a
select, in which case the projection is provided as the final parameter
into the GroupJoin or Join methods. Of course, the delegate still needs
to be called...

(Sorry to be so ridiculous - it just feels like having waded through
the spec for the book, I should get the opportunity to show off every
so often ;)
 
M

Michael S

Paul said:
My initial assumption was that it would be as fast (why wouldn't it), or
actually in some cases even faster (since Linq is a higher level
programmatic
expression, it will allow the compiler to interpret what you really are
trying to do and then choose the most optimal way to do it (where we lazy
programmers sometimes can be careless)).

/Paul

Note that with plinq and multiple CPU, you'd probably see a lot better
performance.
I think I read somewhere that the prototype for plinq, did on avarage 6
times better on a 8 core cpu.

Intresting techonlogy

- Michael Starberg
 
F

Frans Bouma [C# MVP]

Jon said:
It's probably due to the overhead in calling the delegate.
I tested this for a library I'm writing and accessing a property
through a lambda or accessing the property directly and accessing it
through a lambda is roughly 5 times slower. (but still fast, 8ms
(direct access) vs 40ms through the lambda, release builds, 1 million
calls)

Considering the difference in times, roughly twice the speed, it might
be that the lambda indeed is the bottleneck, though there have been
found other bottlenecks in Linq to object's code (concat is O(n^2) for
example according to a recent bugreport) so I wouldn't be surprised if
it is due to something else.

FB


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
F

Frans Bouma [C# MVP]

Nicholas said:
Paul,

In addition to the other comments, you have to remember that when you
perform the call to ToList, which will ultimately iterate through
your query, you are calling three implementations of the methods on
IEnumerable<T>.

The first is in the original implementation of IEnumerable<T> on
rooms.

no, this is done by the Where call, which is equal to
List said:
The second is in the call to the Where extension method, which
returns an IEnumerable<T> which calls the original enumerator and
applies the filter (it is applied in the MoveNext implementation of
the IEnumerable<T> returned).

The last is the call to the Select extension method, which performs
the transformation (if there is one), but still calls through to the
IEnumerable returned by the Where extension method.

isn't called either :)

The query is simply rooms.FindAll<T>(filter). It would be interesting
to see if that .NET 2.0 code would run as fast as his original foreach
or as fast as the linq query :)

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
P

Paul

The query is simply rooms.FindAll said:
to see if that .NET 2.0 code would run as fast as his original foreach
or as fast as the linq query :)

The code below actually performs slightly faster than my original foreach...

public static List<ConferenceRoom> FindAllTest(int minimumSeatingCapacity)
{
return rooms.FindAll(delegate(ConferenceRoom room)
{
if (room.SeatingCapacity >= minimumSeatingCapacity)
return true;
return false;
});
}

So, the scoreboard now looks like this (with a faster computer than the one
before):
Foreach w. nested if-statement: 2 500 000 ticks
Linq: 4 500 000 ticks
FindAll: 2 400 000 ticks

(All tests have been repeated several times and the results are based on an
average "score")

Can anyone come up with a faster way? :)
 
N

Nicholas Paldino [.NET/C# MVP]

Frans,

Yes, you are right, Select doesn't get called, but I should have
elaborated and said the first implementation of IEnumerable<T> to return
something is the implementation of IEnumerable<T> on IList<T>, which is then
filtered through the IEnumerable<T> returned by the call to Where, etc, etc.
 
J

Jon Skeet [C# MVP]

Can anyone come up with a faster way? :)

That's pretty tricky without having complete source code. Give us a
proper test harness and we can experiment appropriately.

Jon
 
Top