Performance issues with multi-dimensional arrays

Henrik Schmid

Hi,

consider the attached code.

Serializing the multi-dimensional array takes about 36s
vs. 0.36s for the single-dimensional array.

Initializing the multi-dimensional array takes about 4s
vs. 0.3s for the single-dimensional array.
(I know initializing is not necessary in this simple example,
but in my application it was necessary to frequently
re-initialize an array)

Are there any workarounds other than using single-dimensional arrays,
storing the array bounds in additional fields (in the actual code the
arrays are not always zero-based) and doing the index calculations in code?

TIA,
Henrik



using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

byte[, ,] multiDimArray = new byte[1000, 1000, 64];
byte[] singleDimArray = new byte[64000000];

// Serialize the multi-dimensional array.
DateTime start = DateTime.Now;
using (Stream stream = File.Open(@"d:\test.tst", FileMode.OpenOrCreate))
{
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(stream, multiDimArray);
}
Console.WriteLine("Serialize multi dim " + (DateTime.Now - start).TotalSeconds);

// Serialize the single-dimensional array.
start = DateTime.Now;
using (Stream stream = File.Open(@"d:\test.tst", FileMode.OpenOrCreate))
{
    BinaryFormatter formatter = new BinaryFormatter();
    formatter.Serialize(stream, singleDimArray);
}
Console.WriteLine("Serialize single dim " + (DateTime.Now - start).TotalSeconds);

// Initialize the multi-dimensional array (bounds queried on every iteration).
start = DateTime.Now;
for (int i = multiDimArray.GetLowerBound(0); i <= multiDimArray.GetUpperBound(0); ++i)
    for (int j = multiDimArray.GetLowerBound(1); j <= multiDimArray.GetUpperBound(1); ++j)
        for (int k = multiDimArray.GetLowerBound(2); k <= multiDimArray.GetUpperBound(2); ++k)
            multiDimArray[i, j, k] = 0;
Console.WriteLine("Init multi dim " + (DateTime.Now - start).TotalSeconds);

// Initialize the single-dimensional array.
start = DateTime.Now;
for (int i = 0; i < singleDimArray.Length; ++i)
    singleDimArray[i] = 0;
Console.WriteLine("Init single dim " + (DateTime.Now - start).TotalSeconds);
 
Lee

Henrik,

I'm not sure about the serialization itself, but I noticed that you are
always doing the multi array first. Remember that spinning up objects
takes time. When I ran your code, I got similar numbers, but when I
switched the calls around to run the single array serialization first, I
got these numbers:

Serialize single dim 3.1087183
Serialize multi dim 18.9803655

True, the multi is still higher, but not 10x higher.

Also, in your code to initialize the arrays, you make repeated calls
to GetLowerBound and GetUpperBound. These take time, lots of time. I
changed your code to store them off as temporary variables first:

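// Bounds stored in temporaries: a, b, c are the lower bounds; x, y, z the upper bounds.
int a = multiDimArray.GetLowerBound(0), x = multiDimArray.GetUpperBound(0);
int b = multiDimArray.GetLowerBound(1), y = multiDimArray.GetUpperBound(1);
int c = multiDimArray.GetLowerBound(2), z = multiDimArray.GetUpperBound(2);

start = DateTime.Now;
for (int i = a; i <= x; ++i)
    for (int j = b; j <= y; ++j)
        for (int k = c; k <= z; ++k)
            multiDimArray[i, j, k] = 0;

Console.WriteLine("Init multi dim " + (DateTime.Now - start).TotalSeconds);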

And I got this for the times:

Init multi dim 0.5155161
Init single dim 0.8904369

Now the multi is faster.

Hope any of this helps,

L. Lee Saunders
http://oldschooldotnet.blogspot.com
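
The two points in this reply (run order affects the numbers, and the bounds can be read once) can be combined in a small timing harness. What follows is only a sketch: the names TimingDemo, Measure and ClearWithHoistedBounds are made up for illustration, and it uses System.Diagnostics.Stopwatch instead of the DateTime arithmetic from the original code.

```csharp
using System;
using System.Diagnostics;

public static class TimingDemo
{
    // Runs the action once untimed as a warm-up, then returns the
    // shortest of `runs` timed executions.
    public static TimeSpan Measure(Action action, int runs = 3)
    {
        action(); // warm-up run, not timed
        TimeSpan best = TimeSpan.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            if (sw.Elapsed < best) best = sw.Elapsed;
        }
        return best;
    }

    // Zeroes a 3-D array, reading each bound once before the loops.
    public static void ClearWithHoistedBounds(byte[,,] array)
    {
        int lo0 = array.GetLowerBound(0), hi0 = array.GetUpperBound(0);
        int lo1 = array.GetLowerBound(1), hi1 = array.GetUpperBound(1);
        int lo2 = array.GetLowerBound(2), hi2 = array.GetUpperBound(2);
        for (int i = lo0; i <= hi0; ++i)
            for (int j = lo1; j <= hi1; ++j)
                for (int k = lo2; k <= hi2; ++k)
                    array[i, j, k] = 0;
    }

    public static void Main()
    {
        byte[,,] multi = new byte[100, 100, 64];
        TimeSpan t = Measure(() => ClearWithHoistedBounds(multi));
        Console.WriteLine("Init multi dim " + t.TotalSeconds);
    }
}
```

Because every measurement is preceded by a warm-up run, the order in which the single-dim and multi-dim cases are measured should matter less than in a single-shot test.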
 
Henrik Schmid

Thank you, Lee.

You are right, storing the array bounds in temporaries
does speed things up in the multi dim case.
I didn't think of this, because for single dim it actually
hurts performance.
By storing the bounds in variables, I could improve the performance
of my application.

There is still the issue with (de)serialization.
And the difference is more like 100x, not 10x.
My little test program may not be optimal, but I don't see
much difference when serializing the single dim array first.

Anyway, I worked around this by changing this huge array to
one-dimensional and this dramatically reduced the time for
opening and saving documents in the application.

Thanks again.

Henrik
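
If the multi-dimensional array has to stay in the rest of the code, one possible variation on the workaround above is to copy it into a flat byte[] only for saving, and back after loading. This is a sketch, not what the application in this thread does: FlattenDemo, Flatten and Unflatten are hypothetical names, and the copy relies on Buffer.BlockCopy, which accepts multi-dimensional arrays of primitive types. The dimension lengths (and any non-zero lower bounds) are not part of the flat data and would have to be stored separately.

```csharp
using System;

public static class FlattenDemo
{
    // Copies a 3-D byte array into a flat array in row-major order
    // (the last index varies fastest).
    public static byte[] Flatten(byte[,,] source)
    {
        byte[] flat = new byte[source.Length];
        Buffer.BlockCopy(source, 0, flat, 0, flat.Length);
        return flat;
    }

    // Rebuilds a zero-based 3-D array from flat data and known lengths.
    public static byte[,,] Unflatten(byte[] flat, int d0, int d1, int d2)
    {
        if (flat.Length != d0 * d1 * d2)
            throw new ArgumentException("Length does not match dimensions.", "flat");
        byte[,,] result = new byte[d0, d1, d2];
        Buffer.BlockCopy(flat, 0, result, 0, flat.Length);
        return result;
    }

    public static void Main()
    {
        byte[,,] multi = new byte[2, 3, 4];
        multi[1, 2, 3] = 9;
        byte[] flat = Flatten(multi);           // serialize `flat` instead of `multi`
        byte[,,] back = Unflatten(flat, 2, 3, 4);
        Console.WriteLine(back[1, 2, 3]);
    }
}
```

The flat array can then be handed to the serializer or written to the stream directly, at the cost of one extra copy in memory.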
 
Frans Bouma [C# MVP]

Henrik said:
byte[, ,] multiDimArray = new byte[1000, 1000, 64];

You can also use byte[][][], which is called a 'jagged' array. A jagged
array is much faster. It requires slightly different code to work with,
but that won't be rocket science. See:
http://dotnetperls.com/Content/Jagged-Array.aspx

FB


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
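
For readers who have not used jagged arrays: unlike byte[,,], a byte[][][] is an array of arrays of arrays, so every inner array has to be allocated separately. A minimal sketch follows; JaggedDemo and Create are illustrative names, not framework APIs.

```csharp
using System;

public static class JaggedDemo
{
    // Allocates a rectangular jagged array: d0 x d1 inner rows of d2 bytes each.
    public static byte[][][] Create(int d0, int d1, int d2)
    {
        byte[][][] result = new byte[d0][][];
        for (int i = 0; i < d0; i++)
        {
            result[i] = new byte[d1][];
            for (int j = 0; j < d1; j++)
                result[i][j] = new byte[d2];
        }
        return result;
    }

    public static void Main()
    {
        byte[][][] jagged = Create(10, 10, 64);
        jagged[3][4][5] = 7;              // indexed with [i][j][k], not [i, j, k]
        Console.WriteLine(jagged[3][4][5]);
    }
}
```

Note that for the 1000 x 1000 x 64 case from the original post this creates one outer array, 1,000 middle arrays and 1,000,000 innermost arrays, so the object count is far higher than with a single byte[,,] or byte[].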
 
Henrik Schmid

Hi,

thanks for the reply.

Actually, the first thing I tried was a "semi-jagged" array: byte[,][],
which was even slower.
Now I tried byte[][][], which is a bit faster (factor 4) than multi dim,
but still slower than single dim (factor 20).

Given that in my real application the first two dimensions are not zero-based,
I would still have to do some index calculation, so I can as well use a
single dim array and have the full performance.

Maybe some future framework or compiler version can apply similar
optimizations to multi dim arrays.

Thanks anyway.

Henrik
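
The single-dim approach with stored bounds and manual index calculation could look roughly like the sketch below. It is an illustration under assumptions, not the code from the application discussed here: the class name FlatByteGrid and its members are invented, and the layout is row-major like byte[,,].

```csharp
using System;

// Hypothetical helper: a 3-D byte grid with arbitrary lower bounds,
// backed by one flat byte[].
public sealed class FlatByteGrid
{
    private readonly int _low0, _low1, _low2;   // lower bound of each dimension
    private readonly int _len1, _len2;          // lengths of dimensions 1 and 2

    public byte[] Data { get; }                 // flat storage; can be serialized directly

    public FlatByteGrid(int low0, int len0, int low1, int len1, int low2, int len2)
    {
        _low0 = low0; _low1 = low1; _low2 = low2;
        _len1 = len1; _len2 = len2;
        Data = new byte[checked(len0 * len1 * len2)];
    }

    // Row-major offset: the last index varies fastest.
    private int Offset(int i, int j, int k)
    {
        return ((i - _low0) * _len1 + (j - _low1)) * _len2 + (k - _low2);
    }

    public byte this[int i, int j, int k]
    {
        get { return Data[Offset(i, j, k)]; }
        set { Data[Offset(i, j, k)] = value; }
    }

    // Re-initializes every element to zero in a single call.
    public void Clear()
    {
        Array.Clear(Data, 0, Data.Length);
    }
}

public static class FlatByteGridDemo
{
    public static void Main()
    {
        // Dimensions 0 and 1 start at -5 and 100; dimension 2 is zero-based.
        FlatByteGrid grid = new FlatByteGrid(-5, 10, 100, 10, 0, 64);
        grid[-5, 100, 0] = 1;   // first element of the flat array
        grid[4, 109, 63] = 2;   // last element of the flat array
        Console.WriteLine(grid.Data[0]);
        Console.WriteLine(grid.Data[grid.Data.Length - 1]);
    }
}
```

The indexer only gets the bounds check of the underlying byte[]; an out-of-range j or k that still maps inside the flat array is not detected, so per-dimension checks would have to be added if that matters.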

 
