Format Numbers to Strings... Performance?


Nico VanHaaster

Hello,

I don't really want to start too much of a debate here, but I have a
feeling this just might cause one. I have been trying to increase the
performance of one of my applications (a C# .NET web app).

Here's the scenario.

I fill a data adapter with up to 64 columns (but as few as 10) of
data, with roughly 300 rows. 63 of the 64 columns are numbers: doubles
and ints.

I then have to loop through the data adapter and build a table out of
the data.

The whole process takes the web app about 1.5 minutes to complete. Now
comes my question.

What is the fastest (best performing) way of converting a string to a
percent, double, or int, and then back to a string to display on the
page?

Also: what sort of performance implications come from statements such
as Convert.ToDouble(MyDataRow[myColumnIndex].ToString()).ToString("P2")?

This seems like pretty basic stuff, but I can't find many resources
out there that deal with these types of problems.

Thank you in advance.
 

Nicholas Paldino [.NET/C# MVP]

Nico,

If you have a specific format that you are converting to/from, then I
imagine that you could write your own specialized routine. However, I don't
know that you are going to do much better than using Parse/TryParse with a
specific format.

In the end, this is easy enough to test, using the Stopwatch class in
the System.Diagnostics namespace.
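
For example, a minimal timing harness might look like this (the loop
body is just a placeholder for whatever conversion you want to
measure):

using System;
using System.Diagnostics;

class TimingDemo
{
    static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < 100000; i++)
        {
            // Code under test: format a double as a percentage.
            string s = (i / 100000.0).ToString("P2");
        }
        sw.Stop();
        Console.WriteLine("Elapsed: {0}ms", sw.ElapsedMilliseconds);
    }
}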

I find it to be a little strange that it would take 1.5 minutes to
process 300 rows with 64 columns. I wouldn't think to look at the
formatting of the strings as the culprit. Is this what you think? My first
guess would be whatever query/stored procedure you are calling that is
getting the data. If I were looking to enhance the performance of this
operation, that is the first place I would look.
 

Arne Vajhøj

Nico said:
I don't really want to start too much of a debate here, but I have a
feeling this just might cause one. I have been trying to increase the
performance of one of my applications (a C# .NET web app).

Here's the scenario.

I fill a data adapter with up to 64 columns (but as few as 10) of
data, with roughly 300 rows. 63 of the 64 columns are numbers: doubles
and ints.

I then have to loop through the data adapter and build a table out of
the data.

The whole process takes the web app about 1.5 minutes to complete. Now
comes my question.

What is the fastest (best performing) way of converting a string to a
percent, double, or int, and then back to a string to display on the
page?

Also: what sort of performance implications come from statements such
as Convert.ToDouble(MyDataRow[myColumnIndex].ToString()).ToString("P2")?

This seems like pretty basic stuff, but I can't find many resources
out there that deal with these types of problems.

You are looking at the wrong place.

I did a little test, and I can convert int -> string -> int 1.3 million
times per second on a three-year-old PC.

64 * 300 conversions should be done in 1/100 of a second.
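
A minimal sketch of that kind of round-trip test (a reconstruction for
illustration; the actual test code wasn't posted):

using System;
using System.Diagnostics;

class RoundTripTest
{
    static void Main()
    {
        const int iterations = 1000000;
        int checksum = 0;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            string s = i.ToString();  // int -> string
            checksum += int.Parse(s); // string -> int, kept so the JIT can't drop the work
        }
        sw.Stop();

        Console.WriteLine("{0} round trips in {1}ms (checksum {2})",
                          iterations, sw.ElapsedMilliseconds, checksum);
    }
}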

Doubles and decimals will take a bit more time.

But it is not the conversions that are taking your 1.5 minutes.

I think you should start by measuring with a profiler.

Arne
 

Jon Skeet [C# MVP]

Here's the scenario.

I fill a data adapter with up to 64 columns (but as few as 10) of
data, with roughly 300 rows. 63 of the 64 columns are numbers: doubles
and ints.

The whole process takes the web app about 1.5 minutes to complete. Now
comes my question.

What is the fastest (best performing) way of converting a string to a
percent, double, or int, and then back to a string to display on the
page?

It's going to be irrelevant. Let's keep the numbers simple - let's
call it 1000 rows of 100 numbers. Here's a quick test app that
generates all the numbers, then starts a timer, converts them to
strings, parses them, and formats them again:

using System;
using System.Diagnostics;

class Test
{
    static void Main()
    {
        Random rng = new Random();
        double[] data = new double[100 * 1000];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = rng.NextDouble();
        }

        Stopwatch sw = Stopwatch.StartNew();
        int totalLength = 0;
        foreach (double datum in data)
        {
            string simpleFormat = datum.ToString();
            double reparsed = double.Parse(simpleFormat);
            string reformatted = reparsed.ToString("P2");
            // Make absolutely sure the JIT isn't optimising it all away
            totalLength += reformatted.Length;
        }
        sw.Stop();

        Console.WriteLine("Time: {0}ms", sw.ElapsedMilliseconds);
        Console.WriteLine("Length: {0}", totalLength);
    }
}

Now, I'm currently writing this post on an Eee PC (701G). It's running
at 630MHz - hardly a speedy beast. The above (which deals with more
than 5 times as much data as you mentioned earlier) takes about 650ms
to execute. In other words, if the whole process took 1.5 minutes on
this machine, the parsing and reformatting would take less than 1% of
the total.

Basically, don't worry about it from a performance point of view. From
a readability point of view, however, it would be good to avoid doing
lots of extra parsing etc. If you can just cast to an appropriate data
type from your data set (or use a strongly-typed data set to start
with), that would make your code cleaner. Depending on the data, you
might also want to consider using decimal instead of double.
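
For example, a minimal sketch (the "Price" column name is made up, and
it assumes the column really is stored as a double):

using System.Data;

class Formatting
{
    // Pull a typed value straight out of the row - no string round trip.
    static string FormatPrice(DataRow row)
    {
        double value = (double) row["Price"]; // direct unbox, no Parse
        return value.ToString("P2");
    }
}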

Jon
 

Nico VanHaaster

Great points, guys, and you were absolutely correct.

I was using the Stopwatch class but was measuring the time
incorrectly. I was doing two things wrong: 1) not treating
Stopwatch.Elapsed as the TimeSpan it already is, and 2) when reading
that TimeSpan, using objTS.Seconds instead of objTS.TotalSeconds.

The problem is in my selection from SQL Server, which in my mind makes
sense.

Thank you for your help. If anything, I now know that I cannot
increase performance much by modifying the way I am converting my data
types.

Jon,
By strongly typed datasets, do you mean something like this?

DataSet objDS = new DataSet();
DataTable objDT = objDS.Tables.Add();
DataColumn dc = new DataColumn("x", Type.GetType("System.Double")); // different for each column
objDT.Columns.Add(dc);
myAdapter.Fill(objDT); // myAdapter: a configured SqlDataAdapter

Thank you again.
 

Hans Kesting

Nico VanHaaster laid this down on his screen:
Also: what sort of performance implications come from statements such
as Convert.ToDouble(MyDataRow[myColumnIndex].ToString()).ToString("P2")?

Just a minor point (as the others pointed out, this will not be your
bottleneck): what column type does MyDataRow[myColumnIndex] have?
If it already *is* a double, then you can just cast it to double,
instead of ToString() followed by Convert.ToDouble (which will do a
Parse).

So use
((double)MyDataRow[myColumnIndex]).ToString("P2");
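
One caveat worth adding (not covered above): if the column is
nullable, the row value comes back as DBNull.Value and the cast will
throw, so a guard may be needed. A small sketch:

using System.Data;

class SafeFormat
{
    // A nullable database column arrives as DBNull.Value, and
    // casting that to double throws an InvalidCastException.
    static string FormatP2(DataRow row, int columnIndex)
    {
        object raw = row[columnIndex];
        return raw is DBNull ? "" : ((double) raw).ToString("P2");
    }
}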

Hans Kesting
 

Jon Skeet [C# MVP]

Great points, guys, and you were absolutely correct.

I was using the Stopwatch class but was measuring the time
incorrectly. I was doing two things wrong: 1) not treating
Stopwatch.Elapsed as the TimeSpan it already is, and 2) when reading
that TimeSpan, using objTS.Seconds instead of objTS.TotalSeconds.

Stopwatch.ElapsedMilliseconds is another way of getting the result
pretty easily.
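
To illustrate the Seconds/TotalSeconds trap, a trivial sketch:

using System;

class TimeSpanDemo
{
    static void Main()
    {
        TimeSpan ts = TimeSpan.FromSeconds(90.5);
        Console.WriteLine(ts.Seconds);      // 30   - only the seconds component
        Console.WriteLine(ts.TotalSeconds); // 90.5 - the whole span, in seconds
    }
}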
The problem is in my selection from SQL Server, which in my mind makes
sense.

Yes - when there's a database involved, local computation is rarely
(but sometimes) the bottleneck.

By strongly typed datasets, do you mean something like this?
DataSet objDS = new DataSet();
DataTable objDT = objDS.Tables.Add();
DataColumn dc = new DataColumn("x", Type.GetType("System.Double")); // different for each column
objDT.Columns.Add(dc);
myAdapter.Fill(objDT); // myAdapter: a configured SqlDataAdapter

Thank you again.

Not quite - I mean going through the designer, which creates a strongly
typed dataset for you. MSDN has a lot of information on the topic,
IIRC - I can't say I've used datasets of either description very much.

Jon
 

Mythran

Nico VanHaaster said:
Great points, guys, and you were absolutely correct.

I was using the Stopwatch class but was measuring the time
incorrectly. I was doing two things wrong: 1) not treating
Stopwatch.Elapsed as the TimeSpan it already is, and 2) when reading
that TimeSpan, using objTS.Seconds instead of objTS.TotalSeconds.

The problem is in my selection from SQL Server, which in my mind makes
sense.

Thank you for your help. If anything, I now know that I cannot
increase performance much by modifying the way I am converting my data
types.

Jon,
By strongly typed datasets, do you mean something like this?
DataSet objDS = new DataSet();
DataTable objDT = objDS.Tables.Add();
DataColumn dc = new DataColumn("x", Type.GetType("System.Double")); // different for each column
objDT.Columns.Add(dc);
myAdapter.Fill(objDT); // myAdapter: a configured SqlDataAdapter

Thank you again.


A strongly typed dataset is not what you have above. Check out the MSDN
documentation on typed datasets, or Google if you prefer. As for creating
columns as above, even though it's not a strongly typed dataset, you should
still use:

DataColumn dc = new DataColumn("x", typeof(double));

Notice the typeof call instead of Type.GetType.
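
For instance, a small sketch of building columns this way (the column
names here are illustrative):

using System.Data;

class TableBuilder
{
    static DataTable Create()
    {
        DataTable table = new DataTable("results");
        // typeof(double) is checked at compile time, whereas
        // Type.GetType("System.Double") is a runtime string lookup
        // that returns null if the name is mistyped.
        table.Columns.Add("x", typeof(double));
        table.Columns.Add("count", typeof(int));
        table.Columns.Add("label", typeof(string));
        return table;
    }
}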

HTH,
Mythran
 

Mythran

Nico VanHaaster said:

Also: what sort of performance implications come from statements such
as Convert.ToDouble(MyDataRow[myColumnIndex].ToString()).ToString("P2")?

This seems like pretty basic stuff, but I can't find many resources
out there that deal with these types of problems.

Thank you in advance.

Instead of:

Convert.ToDouble(MyDataRow[index].ToString()).ToString("P2")

If the column at the specified index is stored as a double, just cast it
directly:

((double) MyDataRow[index]).ToString("P2")

This is a direct cast instead of a conversion. It should be faster,
but I have no definitive tests to prove it.
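
A rough way to test that claim, as a sketch (not a definitive
benchmark; numbers will vary by machine and data):

using System;
using System.Data;
using System.Diagnostics;

class CastVsConvert
{
    static void Main()
    {
        DataTable table = new DataTable();
        table.Columns.Add("v", typeof(double));
        for (int i = 0; i < 1000; i++)
            table.Rows.Add(i / 1000.0);

        const int reps = 1000;
        int total = 0; // accumulate lengths so the JIT can't drop the work

        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < reps; r++)
            foreach (DataRow row in table.Rows)
                total += ((double) row[0]).ToString("P2").Length; // direct cast
        Console.WriteLine("Cast:    {0}ms", sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        for (int r = 0; r < reps; r++)
            foreach (DataRow row in table.Rows)
                total += Convert.ToDouble(row[0].ToString()).ToString("P2").Length;
        Console.WriteLine("Convert: {0}ms (checksum {1})", sw.ElapsedMilliseconds, total);
    }
}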

HTH,
Mythran
 

Mythran

Mythran said:
Nico VanHaaster said:

Also: what sort of performance implications come from statements such
as Convert.ToDouble(MyDataRow[myColumnIndex].ToString()).ToString("P2")?

This seems like pretty basic stuff, but I can't find many resources
out there that deal with these types of problems.

Thank you in advance.

Instead of:

Convert.ToDouble(MyDataRow[index].ToString()).ToString("P2")

If the column at the specified index is stored as a double, just cast it
directly:

((double) MyDataRow[index]).ToString("P2")

This is a direct cast instead of a conversion. It should be faster,
but I have no definitive tests to prove it.

HTH,
Mythran

By the way, the changes in my posts are not a "significant" improvement
in performance. Over a million rows you may see a one- or two-second
difference in processing time, if the conversions and ToString calls are
all you are doing in the loop; but since you are only converting a
couple of thousand values, you probably won't see much of a difference.
Just a note, since I didn't mention it in my last replies.

HTH,
Mythran
 
