process speed


Tony Johansson

Hello!

I just wonder what the difference in processing speed is between using
an array of ints and an array of bytes if the array is very large.

//Tony
 

Jeff Johnson

Tony Johansson said:
I just wonder what the difference in processing speed is between using
an array of ints and an array of bytes if the array is very large.

When processing WHAT?
 

Family Tree Mike

Tony Johansson said:
For example looping through the array and summing all the values in each
index.

//Tony



The looping should not matter, but the sum would rarely fit in a byte
result; that is, the sum of many bytes quickly exceeds a byte's range.
Therefore this would likely involve casting up to int, which would
actually take longer.
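
A minimal C# sketch of that summing; the array size and random fill
here are illustrative assumptions:

using System;

class ByteSum
{
    static void Main()
    {
        // Illustrative data: ten million random bytes.
        byte[] data = new byte[10000000];
        new Random(42).NextBytes(data);

        // The sum of many bytes exceeds byte.MaxValue (255) almost
        // immediately, so the accumulator must be a wider type; each
        // element is implicitly widened before the addition.
        long sum = 0;
        foreach (byte b in data)
            sum += b;

        Console.WriteLine(sum);
    }
}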

Mike
 

Peter Duniho

Family said:
Tony Johansson said:
[...]
For example looping through the array and summing all the values in each
index.

The looping should not matter, and the summing would rarely be contained in
a byte result. That is, the sum of many bytes likely exceeds a byte.
Therefore this would likely involve casting to ints, which would actually
take longer.

In native machine code? Probably not. Reading a single byte into a
register versus reading a 32-bit int into a register is basically the
same. Because of the static typing in C# and the specific types
involved, there's no real casting involved (i.e. the run-time doesn't
have to do a check…the data is just copied from one place to another).

IMHO, the correct answer is "who cares?" :) The next correct answer is
"the only correct way to know is to write both and measure the
performance of each".

But if we're going to speculate, my expectation is that caching and
virtual memory effects will swamp any other performance consideration.
And in that case, given N array elements, an array of bytes will take
75% less room, and thus require 75% fewer cache misses and page faults.
The byte array _could_ in fact perform better.

In reality, unless we're talking about data that naturally fits in a
byte array, is for some reason extremely large, and will be processed on
a 32-bit system – and so there are non-performance reasons to go with
the byte[] versus uint[] or int[] – the code ought to just be written
with uint[] or int[], until such time as it's been proven that there is
some performance bottleneck that can be addressed by using byte[].
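
A rough sketch of the "write both and measure" idea; the element
count, the Stopwatch timing, and the long accumulator are my
assumptions, not a rigorous benchmark:

using System;
using System.Diagnostics;

class SumBenchmark
{
    const int N = 50000000; // 50 million elements; the size is an assumption

    static void Main()
    {
        byte[] bytes = new byte[N];
        new Random(42).NextBytes(bytes);
        int[] ints = new int[N];
        for (int i = 0; i < N; i++) ints[i] = bytes[i];

        // Warm up the JIT so compilation time isn't measured.
        SumBytes(bytes);
        SumInts(ints);

        var sw = Stopwatch.StartNew();
        long b = SumBytes(bytes);
        sw.Stop();
        Console.WriteLine("byte[]: " + sw.ElapsedMilliseconds + " ms (sum=" + b + ")");

        sw.Restart();
        long n = SumInts(ints);
        sw.Stop();
        Console.WriteLine("int[]:  " + sw.ElapsedMilliseconds + " ms (sum=" + n + ")");
    }

    static long SumBytes(byte[] a)
    {
        long sum = 0;
        for (int i = 0; i < a.Length; i++) sum += a[i];
        return sum;
    }

    static long SumInts(int[] a)
    {
        long sum = 0;
        for (int i = 0; i < a.Length; i++) sum += a[i];
        return sum;
    }
}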

Pete
 

Tony Johansson

Peter Duniho said:
[...]

Well explained.

//Tony
 

Arne Vajhøj

Tony Johansson said:
I just wonder what the difference in processing speed is between using
an array of ints and an array of bytes if the array is very large.

Generally it depends on the system, the .NET version and the code.

Operations on int are typically faster than operations
on byte.

But the int array is larger than the byte array and therefore
requires more data to be moved.

It would be a good guess that:
* int is faster than byte if the array fits in the L2 cache
* byte is faster than int if the array does not fit in the L2 cache

You could test it yourself in your context (your system,
your .NET version and your code).
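
A sketch of what such a test could look like, sweeping sizes that
straddle a typical L2 cache; the sizes, repetition count, and summing
loops are all assumptions:

using System;
using System.Diagnostics;

class CacheSweep
{
    static void Main()
    {
        // Element counts chosen to straddle typical L2 cache sizes:
        // at 1M elements the byte[] is ~1 MB while the int[] is ~4 MB.
        int[] sizes = { 1 << 14, 1 << 17, 1 << 20, 1 << 23 };
        foreach (int n in sizes)
        {
            byte[] b = new byte[n];
            int[] w = new int[n];

            long tb = TimeMs(() => SumBytes(b));
            long tw = TimeMs(() => SumInts(w));
            Console.WriteLine("n=" + n + ": byte[] " + tb + " ms, int[] " + tw + " ms");
        }
    }

    static long TimeMs(Func<long> body)
    {
        body(); // warm-up pass so JIT compilation isn't timed
        long sink = 0;
        var sw = Stopwatch.StartNew();
        for (int rep = 0; rep < 200; rep++) sink += body();
        sw.Stop();
        GC.KeepAlive(sink); // consume the result so the loop isn't optimized away
        return sw.ElapsedMilliseconds;
    }

    static long SumBytes(byte[] a) { long s = 0; for (int i = 0; i < a.Length; i++) s += a[i]; return s; }
    static long SumInts(int[] a) { long s = 0; for (int i = 0; i < a.Length; i++) s += a[i]; return s; }
}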

But my suggestion would be: don't.

Reasons:
- it is very unlikely that a real-world app would have this
problem as the performance-determining factor
- it is very likely that a real-world app will run on a
lot of different systems and .NET versions over the next 10-20 years,
so I consider the result of such a test useless.

Arne
 

kndg

Jim said:
I have come to believe that array processing in C# is just plain slow
because of bounds checking, along with other factors. If you take two
different algorithms, say bubble sort and prime number generation, and
write them in C/C++ and then C#, you will see that the C/C++
implementation of bubble sort is roughly 3x faster than the C# version.
However, if you take a compute-intensive algorithm like prime number
generation, the C# version is now only less than 1x slower than the C++
version.

Look at the following for more details:

http://www.cherrystonesoftware.com/doc/AlgorithmicPerformance.pdf

Hi Jim,

Array bounds checking is a very good feature which improves safety and
avoids the nasty buffer-overrun bugs found in old C/C++ programs, and I
don't think it has a significant impact on performance. The article you
mentioned above is also quite suspicious: I don't think a C# program
would perform that badly compared to Java. So, I did a quick test. I
haven't read the whole article, so I picked the simplest algorithm
- BubbleSort. On my 2GHz, 2GB laptop (which is under-spec compared
to their machine), I get the following result:

C#: 65 seconds
Java: 130 seconds

That is quite the reverse, and nearly matches the C/C++ performance.
Probably there is something wrong with their implementation...
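
For what it's worth, a minimal sketch of the kind of test described;
the array size and random data are assumptions, since the exact
parameters aren't given:

using System;
using System.Diagnostics;

class BubbleSortTest
{
    static void Main()
    {
        const int N = 50000; // the size is an assumption; scale to taste
        int[] a = new int[N];
        var rng = new Random(1);
        for (int i = 0; i < N; i++) a[i] = rng.Next();

        var sw = Stopwatch.StartNew();
        BubbleSort(a);
        sw.Stop();
        Console.WriteLine("BubbleSort of " + N + " ints: "
            + sw.Elapsed.TotalSeconds + " s");
    }

    // Classic O(n^2) bubble sort; the indexed accesses a[j] and a[j + 1]
    // are where C#'s array bounds checking comes into play.
    static void BubbleSort(int[] a)
    {
        for (int i = a.Length - 1; i > 0; i--)
        {
            for (int j = 0; j < i; j++)
            {
                if (a[j] > a[j + 1])
                {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }
    }
}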

Regards.
 
