The following code simply creates a byte array from a HexString.
Can anyone improve the perfomance of this method?
I was bored and decided to do some general experimenting with this
problem. From the beginning I am assumed that the hexstring consists
only of numbers and capital letters. All algorithms expects wellformed
data and does not guarantee nice error handling. Some of the
algorithms could easily be transformed to handle more complex input
data though.
Before I continue I would also like to mention that when implementing
an optimized algorithm in production code it is recommended that you
have test cases that compares the result with a non optimized version
to make sure that no mistakes is made. Premature optimization is also
never a good idea.
Here is a list sorted by performance from worst to best (in seconds
for a number of iterations)
Original 528
Boolean 9,0
Inlined Boolean 8,9
Look-Up Table 7,9
If-Else 7,1
Inlined If-Else 6,3
Unsafe Boolean 5,4
Following is an overview of all the different algorithms:
1. Look-Up Table
This was the first program I came up with. It uses a simple
pre-calculated table to look up the value of any input character. As
expected it is pretty fast and it can also easily be extended to
support 'a'-'f' without performance loss.
Byte[] lookup = new byte[256];
byte counter = 0;
for(char c='0';c<='9';c++)
{ lookup[c] = counter;counter++;}
counter = 10;
for(char c='A';c<='F';c++)
{ lookup[c] = counter;counter++;}
for(int i=0;i<hexString.Length;i+=2)
Bytes[i>>1] = (byte)((lookup[hexString
] << 4) +
lookup[hexString[i+1]]);
2. If-Else
Instead of using a look-up table the above program could easily be
modified by calling a method that calculates the correct value on the
spot. I was suprised when I noticed that this actually was faster than
looking up the value. The only disadvantage is that it suffers a large
performance hit, probably to the il not inlining, when extending to
handle 'a'-'f'
for(int i=0;i<hexString.Length;i+=2)
Bytes[i>>1] = (byte)((GetVal(hexString) << 4) +
GetVal(hexString[i+1]));
static int GetVal(char c){
if(c <= '9') return c - '0';
else return c - 'A' + 10;
}
3. Inlined If-Else
This is the same as the above program except that instead of calling a
function I calculate the value inside the for loop. This seems to
yield a slight performance increase. It is fast and versitile. It is
less readable than look-up table or ordinary if-else.
4. Boolean Logic
Avoiding jumps in the code can sometimes speed up program execution so
I decided to write a simple boolean version of GetVal. The efficency
was slightly lower than the other methods.
static int GetVal(char c){
return (c & 15) + ((c & 64) >> 3) + ((c & 64) >> 6);
}
5. Inlined Boolean Logic
Inlining boolean logic didn't seem to help. The IL compiler inline
seems to be working smothly already.
6. Unsafe Boolean Logic
In a final attempt to pull the most out of c# I decided to do an
unsafe version of boolean logic that accessed two letters at a time.
This proved to be the most efficent program, although not that
readable. Do not use unless maximum performance is needed.
fixed(char* c = hexString){
int* num = (int*) c;
for(int i=0;i<hexString.Length>>1;i++){
int t = (num & 983055) + ((num & 4194368) >> 3) + ((num &
4194368) >> 6);
t = t & 983055;
Bytes = (byte)((t << 4) | (t>>16));
}
}