C# compiler fails to optimize for loop same as foreach

 
 
Mike Lansdaal
 
      5th Jan 2005
I came across a reference on a web site
(http://www.personalmicrocosms.com/ht...htextbox_lines )
that said to speed up access to a rich text box's lines that you needed to
use a "foreach" loop instead of a "for" loop. This made absolutely no sense
to me, but the author had posted his code and timing results. The "foreach"
(a VB and other languages construct) was 0.01 seconds to access 1000 lines in
rich text box, whereas the "for" loop (a traditional C++ construct) was an
astounding 25 seconds (on a not very fast PC).

I recreated a test file using the partial source code posted by the author
and verified that there is a SIGNIFICANT performance difference between the
two constructs (although on my PC it was 0.01 seconds vs 3.6 seconds - still
a noticeable delay). Unfortunately, there was no explanation as to why this
was the case, and I couldn't see any reason why one loop construct would
be different. Looking at the generated IL code with Lutz Roeder's Reflector
tool, I see that the real culprit is not the loop structure but the
get_Lines() call, which is pulled out of the loop in the "foreach" version
but not in the "for" version. Which leads me to post this question: why does
the compiler's code generation/optimization differ between the two, and is
there any setting that can change this?

Interestingly, this is true for both Debug and Release builds. For the "for"
loop, the compiler generated code that calls that function twice on each pass
of the loop (once for the loop index check and then again for the length
calculation). Pulling out unnecessary function calls is a pretty basic
optimization, and I'm surprised that the compiler didn't detect this.
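
The call counts described here can be reproduced without a form. What follows is only a minimal sketch: CountingBox and LoopDemo are hypothetical names, not part of Windows Forms, and the Lines getter is assumed to rebuild its array on every read, which is what the timings in this thread suggest RichTextBox.Lines does.

```csharp
using System;

// Hypothetical stand-in for a control whose Lines getter is expensive.
// It is not RichTextBox; it only mimics "rebuild the array on every read".
class CountingBox
{
    private readonly string text;
    public int GetterCalls;           // how many times Lines was read

    public CountingBox(string text) { this.text = text; }

    public string[] Lines
    {
        get
        {
            GetterCalls++;
            return text.Split('\n');  // fresh array each time
        }
    }
}

static class LoopDemo
{
    static CountingBox MakeBox(int lineCount)
    {
        string[] parts = new string[lineCount];
        for (int i = 0; i < lineCount; i++) parts[i] = "line #" + i;
        return new CountingBox(string.Join("\n", parts));
    }

    // The getter is read in the condition and in the body on every pass.
    public static int CallsWithFor(int lineCount)
    {
        CountingBox box = MakeBox(lineCount);
        int len = 0;
        for (int i = 0; i < box.Lines.Length; i++)
        {
            len += box.Lines[i].Length;
        }
        return box.GetterCalls;
    }

    // foreach evaluates the collection expression once, before iterating.
    public static int CallsWithForeach(int lineCount)
    {
        CountingBox box = MakeBox(lineCount);
        int len = 0;
        foreach (string line in box.Lines)
        {
            len += line.Length;
        }
        return box.GetterCalls;
    }

    static void Main()
    {
        Console.WriteLine("for:     " + CallsWithFor(1000) + " getter calls");
        Console.WriteLine("foreach: " + CallsWithForeach(1000) + " getter calls");
    }
}
```

With 1000 lines the for version reads the getter 2001 times (1001 condition checks plus 1000 body reads), while the foreach version reads it once.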

With the IDE's intellisense and auto completion features, the "for" loop
construct shown in the code below seems like something that someone might
actually code up, and of course who would have figured out that the get_Lines
method would be so performance intensive.

Makes me wonder if there are any other gotchas like this.

Thanks, Mike L.

--------------------------------------------------------------------------------------------

// Simple Windows form with a RichTextBox control, initialized with 1000 lines
// of text (e.g., "line #101", etc.).

private void ForLoopButton_Click(object sender, System.EventArgs e)
{
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    // Slow: TheRichTextBox.Lines is read in the condition and again in the body.
    for (int i = 0; i < TheRichTextBox.Lines.Length; i++)
    {
        Len += TheRichTextBox.Lines[i].Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "for loop\r\n\r\nElapsed time = " +
        ((double) ElapsedTime / 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}

private void ForEachLoopButton_Click(object sender, System.EventArgs e)
{
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    // Fast: TheRichTextBox.Lines is read once, before the iteration starts.
    foreach (String Line in TheRichTextBox.Lines)
    {
        Len += Line.Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "foreach loop\r\n\r\nElapsed time = " +
        ((double) ElapsedTime / 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}

private void ForLoopButton2_Click(object sender, System.EventArgs e)
{
    // Performance results are now the same as ForEachLoopButton_Click
    // with the changes made.
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    // Read the property once and loop over the cached array.
    string[] lines = TheRichTextBox.Lines;
    for (int i = 0; i < lines.Length; i++)
    {
        Len += lines[i].Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "for loop\r\n\r\nElapsed time = " +
        ((double) ElapsedTime / 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}

 
Fredrik Wahlgren
 
      5th Jan 2005


Amazing! I had no idea. I sure hope someone is capable of explaining this.

/ Fredrik


 
Fredrik Wahlgren
 
      5th Jan 2005

Hi

I found something here that may explain this problem:
http://www.codeproject.com/csharp/foreach.asp

/ Fredrik


 
David Browne
 
      6th Jan 2005

Ah. It's not a compiler problem. It's a property problem.

get_Lines() is expensive. Who knew? That's the problem with properties: you
never know how much code they run.

Anyway, try this:

string[] lines = TheRichTextBox.Lines;
for (int i = 0; i < lines.Length; i++)
{
    Len += lines[i].Length;
}

It should be similar to the foreach case.

David


 
Mike Lansdaal
 
      6th Jan 2005
Fredrik - Interesting article (which recommends always using for instead of
foreach, but also generated opposing thoughts). I found this blog link in
the article comments
(http://blogs.msdn.com/brada/archive/...29/123105.aspx ) which suggests
that the code generation for the two loop types is "basically identical" and
that "foreach" is recommended for "clarity".

Thanks, Mike

 
Mike Lansdaal
 
      6th Jan 2005
Yes, exactly. That's what I did (and that's what the foreach does). My
concern was that in one case the compiler did one thing (pulled the property
call out of the loop) and in another case didn't (in the for loop case, it's
there for the loop check and again for the calculation).

Thanks, Mike

 
David Browne
 
      6th Jan 2005

Well, in the for loop it can't pull it out. For all the compiler knows,
get_Lines() might start returning a completely different array halfway
through the iteration.

In the foreach case, the compiler has more information. It knows that it's
iterating the result of get_Lines(), so it calls it once and walks that one
array.
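
To make the "different array halfway through" point concrete, here is a small sketch under stated assumptions: GrowingList and HoistDemo are hypothetical names invented for illustration, and Items stands in for any property whose result can change while the loop runs. If the compiler silently cached the property, the first loop would behave like the second.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, for illustration only: Items returns a snapshot
// of a list that the loop body is allowed to modify.
class GrowingList
{
    private readonly List<string> items = new List<string>();

    public void Add(string item) { items.Add(item); }

    public string[] Items
    {
        get { return items.ToArray(); }  // new snapshot on every read
    }
}

static class HoistDemo
{
    // Property re-read on every pass: sees the item added by the body.
    public static int VisitRereading()
    {
        GrowingList list = new GrowingList();
        list.Add("a");
        list.Add("b");
        int visited = 0;
        for (int i = 0; i < list.Items.Length; i++)
        {
            if (i == 0) list.Add("c");   // body changes what Items returns
            visited++;
        }
        return visited;
    }

    // Property read once up front: iterates the original snapshot only.
    public static int VisitHoisted()
    {
        GrowingList list = new GrowingList();
        list.Add("a");
        list.Add("b");
        int visited = 0;
        string[] snapshot = list.Items;
        for (int i = 0; i < snapshot.Length; i++)
        {
            if (i == 0) list.Add("c");
            visited++;
        }
        return visited;
    }

    static void Main()
    {
        Console.WriteLine("re-reading: " + VisitRereading() + " passes");
        Console.WriteLine("hoisted:    " + VisitHoisted() + " passes");
    }
}
```

The re-reading loop makes 3 passes and the hoisted loop makes 2, so the two forms are not interchangeable in general and the choice has to be left to the programmer.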


David


 
Mike Lansdaal
 
      6th Jan 2005
David - Thanks. I think I was assuming something about the context of the
iteration, but I see from your explanation that it would be impossible
for the compiler to determine that.

Thanks, Mike

 
Kevin Yu [MSFT]
 
      6th Jan 2005
Hi Mike,

Here is an official document from MSDN. I think it will be clearer after
checking this article.

http://msdn.microsoft.com/library/de...us/dnpag/html/scalenetchapt05.asp

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

 
Kevin Yu [MSFT]
 
      6th Jan 2005
Hi Mike,

Generally, a for loop performs better than a foreach loop. In the code from
the link in your first post, though, there are differences between the for
and foreach versions that make the performance very different. For
example,

for (int i = 0; i < TheRichTextBox.Lines.Length; i++)

RichTextBox.Lines computes its result on every call, and the loop condition
calls the property on each pass. Computing it takes a lot of time, since it
needs to go through all the text in the RichTextBox. If you change the code
to the following, the run time will decrease tremendously.

int a = TheRichTextBox.Lines.Length;
for (int i = 0; i < a; i++)

There are also many other differences in how the line reference is obtained
here, so I don't think this test result is reliable. Please refer to the
official document I provided in my last post.
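
One caveat worth checking: caching only the length removes the property read from the loop condition, but a body that still indexes TheRichTextBox.Lines[i] keeps reading the property on every pass. The sketch below counts getter reads for both variants. CountingBox and CacheDemo are hypothetical names, and the getter is assumed to rebuild its array on each read, as the timings earlier in the thread suggest.

```csharp
using System;

// Hypothetical stand-in (not RichTextBox): Lines rebuilds its array on
// every read and counts how often it is read.
class CountingBox
{
    private readonly string text;
    public int GetterCalls;

    public CountingBox(string text) { this.text = text; }

    public string[] Lines
    {
        get { GetterCalls++; return text.Split('\n'); }
    }
}

static class CacheDemo
{
    static CountingBox MakeBox(int lineCount)
    {
        string[] parts = new string[lineCount];
        for (int i = 0; i < lineCount; i++) parts[i] = "line #" + i;
        return new CountingBox(string.Join("\n", parts));
    }

    // Caches only the length: the body still reads Lines on every pass.
    public static int CallsCachingLength(int lineCount)
    {
        CountingBox box = MakeBox(lineCount);
        int len = 0;
        int a = box.Lines.Length;
        for (int i = 0; i < a; i++)
        {
            len += box.Lines[i].Length;
        }
        return box.GetterCalls;
    }

    // Caches the array itself: Lines is read exactly once.
    public static int CallsCachingArray(int lineCount)
    {
        CountingBox box = MakeBox(lineCount);
        int len = 0;
        string[] lines = box.Lines;
        for (int i = 0; i < lines.Length; i++)
        {
            len += lines[i].Length;
        }
        return box.GetterCalls;
    }

    static void Main()
    {
        Console.WriteLine("cached length: " + CallsCachingLength(1000) + " getter calls");
        Console.WriteLine("cached array:  " + CallsCachingArray(1000) + " getter calls");
    }
}
```

For 1000 lines, caching the length gives 1001 reads and caching the array gives 1, so holding on to the array is the change that matches the foreach timing.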

HTH.

Kevin Yu

 