Conversion signed/unsigned

yves

Hello,

I'm trying to check if only the lsb 8 bits of an integer are used.

bool Func(int x)
{
if ((x & 0xFFFFFF00) == 0)
{
return true;
}
else
{
return false;
}
}

This actually comes down to checking if an integer is in the range
[0,255]. Alternatively I could do:

bool Func(int x)
{
if (x >= 0 && x <= 255)
{
return true;
}
else
{
return false;
}
}

Since the first function only contains one conditional jump and the
second one contains two, the first one should be faster. Oddly enough
when doing some performance tests I found out that the first one (with
only one condition/jump) is doing worse than the second one (with two
conditions/jumps). I've looked at the MSIL to try to figure out why and
found out that the first function is actually translated as:

ldarg.0
conv.i8
ldc.i4 0xffffff00
conv.u8
and
ldc.i4.0
conv.i8
bne.un.s ...

This code contains a number of 'conv' instructions which are causing
the overhead. So I'm wondering which value should be used so that only
4 bytes are used and the 24 MSB are set to 1 while the others are 0?
0xffffff00 is seen as an unsigned value while I want it to be seen as a
signed one of 4 bytes. Any ideas?

TIA

Yves
 
yves said:
This code contains a number of 'conv' instructions which are causing
the overhead. So I'm wondering which value should be used so that only
4 bytes are used and the 24 MSB are set to 1 while the others are 0?
0xffffff00 is seen as an unsigned value while I want it to be seen as a
signed one of 4 bytes. Any ideas?

You claim it's the "conv" instructions which are causing the overhead.
Have you actually examined the optimised, JITted assembly code which is
produced? Looking at the IL in isolation isn't terribly useful.
 
I replaced 0xFFFFFF00 with -256 which is its decimal equivalent when
using signed 32 bit integers. When I compile that code, all 'conv'
instructions are gone and the result actually runs faster as one would
have suspected.

So for some reason 0xFFFFFF00 gets compiled to:
ldc.i4 0xffffff00
conv.u8

while -256 gets compiled to:
ldc.i4 0xffffff00

One implication of using 0xFFFFFF00 is that all other 'parties' in the
equation need to be represented by 8 bytes as well, meaning that more
'conv' instructions will be inserted into the code. There is nothing
the JITter can do about them.

Doing some further testing I noticed that "int y = 0xFFFF0000;"
generates a conversion error from unsigned to signed int. The error
only occurs if you use a hex value which is high enough to
'corrupt' the sign bit. So if your hex value fits in 31 bits
(<= 0x7FFFFFFF) it's considered signed, while if it needs
all 32 bits (> 0x7FFFFFFF) it's considered unsigned. I guess there is
some logic to it but I don't really see it.

Yves
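
For what it's worth, the behaviour described above can be checked directly. This is only a sketch: it assumes the C# rule that an unsuffixed integer literal takes the first of int, uint, long, ulong that can hold its value, and that combining an int with a uint promotes both operands to long. The class and method names are made up for illustration.

```csharp
using System;

static class LiteralTypes
{
    // Returns the type the compiler picked for each expression.
    public static string[] Describe(int x)
    {
        var small = 0x7FFFFFFF;      // fits in int
        var big = 0xFFFFFF00;        // too large for int, so uint
        var mixed = x & 0xFFFFFF00;  // int & uint: both promoted to long
        var narrow = x & -256;       // int & int stays int
        return new[]
        {
            small.GetType().Name,
            big.GetType().Name,
            mixed.GetType().Name,
            narrow.GetType().Name
        };
    }

    static void Main()
    {
        // Prints: Int32, UInt32, Int64, Int32
        Console.WriteLine(string.Join(", ", Describe(300)));
    }
}
```

The third entry is where the conv.i8 / conv.u8 instructions in the IL come from: the "and" is done on 64-bit operands.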
 
Alternately,
bool Func(int x)
{
if ((x & 255) == x)
{
return true;
}
else
{
return false;
}
}

or even
bool Func(int x)
{
return ((x & 255) == x);
}

 
yves said:
I replaced 0xFFFFFF00 with -256 which is its decimal equivalent when
using signed 32 bit integers. When I compile that code, all 'conv'
instructions are gone and the result actually runs faster as one would
have suspected.

But that doesn't mean it's the "conv" that's taking the time. I suspect
the time is really being taken by then performing a 64-bit "AND"
instead of a 32-bit one.

yves said:
So for some reason 0xFFFFFF00 gets compiled to:
ldc.i4 0xffffff00
conv.u8

while -256 gets compiled to:
ldc.i4 0xffffff00

One implication of using 0xFFFFFF00 is that all other 'parties' in the
equation need to be represented by 8 bytes as well, meaning that more
'conv' instructions will be inserted into the code. There is nothing
the JITter can do about them.

Doing some further testing I noticed that "int y = 0xFFFF0000;"
generates a conversion error from unsigned to signed int. The error
only occurs if you use a hex value which is high enough to
'corrupt' the sign bit. So if your hex value fits in 31 bits
(<= 0x7FFFFFFF) it's considered signed, while if it needs
all 32 bits (> 0x7FFFFFFF) it's considered unsigned. I guess there is
some logic to it but I don't really see it.

It should only generate that in a "checked" context. Try using:
int y = unchecked((int)0xffffff00);

and it should be fine.
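
A small sketch of that cast, with `MaskConstants` and `HighMask` as purely illustrative names. As far as I know, the explicit cast of the constant without `unchecked` is rejected at compile time, because 0xFFFFFF00 as a constant does not fit in an int.

```csharp
using System;

static class MaskConstants
{
    // Same bit pattern as 0xFFFFFF00, reinterpreted as a signed 32-bit value.
    public const int HighMask = unchecked((int)0xFFFFFF00);

    static void Main()
    {
        Console.WriteLine(HighMask);                 // -256
        Console.WriteLine(HighMask.ToString("X8"));  // FFFFFF00
        Console.WriteLine(HighMask == -0x100);       // True
    }
}
```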

If you change your original method to:

bool Func(int x)
{
if ((x & -0x100) == 0)
{
return true;
}
else
{
return false;
}
}

you'll find there are no 64-bit conversions going on.

The method is also a lot shorter in IL (as well as C#) if you change it
to:

bool Func(int x)
{
return ( (x & -0x100) == 0);
}

I haven't tested the JITted speed of that, but it's worth trying.
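
One rough way to try it is a Stopwatch loop like the sketch below. The class name, method names and iteration count are assumptions chosen for illustration, and this is not a rigorous benchmark: the numbers depend on the JIT version, the platform, and whether the build is optimised, so treat them as a hint only.

```csharp
using System;
using System.Diagnostics;

static class MaskTiming
{
    static bool ViaUIntMask(int x) { return (x & 0xFFFFFF00) == 0; } // 64-bit "and" in the IL
    static bool ViaIntMask(int x)  { return (x & -0x100) == 0; }     // 32-bit "and" in the IL

    // Counts the values in [-n, n) that pass the test. Returning the
    // count gives the loop an observable result.
    public static int CountUIntMask(int n)
    {
        int hits = 0;
        for (int i = -n; i < n; i++)
            if (ViaUIntMask(i)) hits++;
        return hits;
    }

    public static int CountIntMask(int n)
    {
        int hits = 0;
        for (int i = -n; i < n; i++)
            if (ViaIntMask(i)) hits++;
        return hits;
    }

    static void Main()
    {
        const int n = 50000000;

        // Warm-up calls so both methods are JIT-compiled before timing.
        CountUIntMask(1000);
        CountIntMask(1000);

        Stopwatch sw = Stopwatch.StartNew();
        int a = CountUIntMask(n);
        sw.Stop();
        Console.WriteLine("0xFFFFFF00 mask: {0} hits, {1} ms", a, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        int b = CountIntMask(n);
        sw.Stop();
        Console.WriteLine("-0x100 mask:     {0} hits, {1} ms", b, sw.ElapsedMilliseconds);
    }
}
```

Both loops should report the same number of hits; only the timings are of interest.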
 
yves said:
Hello,

I'm trying to check if only the lsb 8 bits of an integer are used.

bool Func(int x)
{
if ((x & 0xFFFFFF00) == 0)
{
return true;
}
else
{
return false;
}
}

This actually comes down to checking if an integer is in the range
[0,255]. Alternatively I could do:

bool Func(int x)
{
if (x >= 0 && x <= 255)
{
return true;
}
else
{
return false;
}
}

Why don't you do:
bool Func(int x)
{
return ((uint)x <= 255);
}

?

FB
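
To see that the variants suggested so far in this thread agree, here is a minimal sketch that runs them over a few boundary values. The class and method names are invented for the example, and the uint cast is wrapped in `unchecked` on the assumption that the code might be compiled with overflow checking turned on.

```csharp
using System;

static class ByteRangeChecks
{
    public static bool RangeCheck(int x) { return x >= 0 && x <= 255; }
    public static bool MaskCheck(int x)  { return (x & -0x100) == 0; }
    public static bool KeepLow8(int x)   { return (x & 255) == x; }

    // Negative values wrap to large uint values, so one comparison is enough.
    public static bool UIntCheck(int x)  { return unchecked((uint)x) <= 255; }

    public static bool AllAgree(int x)
    {
        bool expected = RangeCheck(x);
        return MaskCheck(x) == expected
            && KeepLow8(x) == expected
            && UIntCheck(x) == expected;
    }

    static void Main()
    {
        int[] samples = { int.MinValue, -256, -1, 0, 1, 255, 256, int.MaxValue };
        foreach (int x in samples)
            Console.WriteLine("{0,11}: in range = {1}, all agree = {2}",
                              x, RangeCheck(x), AllAgree(x));
    }
}
```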

 
| Hello,
|
| I'm trying to check if only the lsb 8 bits of an integer are used.
|
| bool Func(int x)
| {
| if ((x & 0xFFFFFF00) == 0)
| {
| return true;
| }
| else
| {
| return false;
| }
| }
|
| This actually comes down to checking if an integer is in the range
| [0,255]. Alternatively I could do:
|
| bool Func(int x)
| {
| if (x >= 0 && x <= 255)
| {
| return true;
| }
| else
| {
| return false;
| }
| }
|
| Since the first function only contains one conditional jump and the
| second one contains two, the first one should be faster. Oddly enough
| when doing some performance tests I found out that the first one (with
| only one condition/jump) is doing worse than the second one (with two
| conditions/jumps). I've looked at the MSIL to try to figure out why and
| found out that the first function is actually translated as:
|
| ldarg.0
| conv.i8
| ldc.i4 0xffffff00
| conv.u8
| and
| ldc.i4.0
| conv.i8
| bne.un.s ...
|
| This code contains a number of 'conv' instructions which are causing
| the overhead. So I'm wondering which value should be used so that only
| 4 bytes are used and the 24 MSB are set to 1 while the others are 0?
| 0xffffff00 is seen as an unsigned value while I want it to be seen as a
| signed one of 4 bytes. Any ideas?
|
| TIA
|
| Yves
|

Note that the IL doesn't reflect what the JIT really does. The JIT
compiler is platform dependent (X86/X64/IA) and knows exactly what it
should do for the underlying platform.
For instance, the conv instructions are bogus here: they aren't translated
into any conversion on X86.
What's really going on can be seen in the X86 code generated by the JIT.
Both of the following should generate identical X86 code (like Case 2), but
they don't:

if ((x & 0xFFFFFF00) == 0)
if ((x & -256) == 0)

Case 1:
if ((x & 0xFFFFFF00) == 0)

Native (JIT-compiled) code:
8bf1 mov esi,ecx
33ff xor edi,edi
8bc6 mov eax,esi
c1f81f sar eax,0x1f
f7c600ffffff test esi,0xffffff00
0f95c0 setne al
0fb6c0 movzx eax,al
85c0 test eax,eax
7505 jnz 00cb00e5
bf01000000 mov edi,0x1
8bc7 mov eax,edi
5e pop esi
5f pop edi
c3 ret

Here the sign of x is extracted: "mov eax,esi" followed by "sar eax,0x1f"
leaves 0 in eax if x >= 0 and -1 (all bits set) if x < 0. That is presumably
what remains of the conv.i8 sign extension, but eax is overwritten by the
setne/movzx pair before it is ever read, so these two instructions (mov and
sar) are bogus. Compare with Case 2:

if ((x & -256) == 0)

8bf1 mov esi,ecx
33ff xor edi,edi
f7c600ffffff test esi,0xffffff00
0f95c0 setne al
0fb6c0 movzx eax,al
85c0 test eax,eax
7505 jnz 00cb0120
bf01000000 mov edi,0x1
8bc7 mov eax,edi
5e pop esi
5f pop edi
c3 ret

This is more efficient because the "sign test" is not done. Why is it done
in Case 1? Well, that's something you should ask the JIT team ;-)
Anyway, as you can see, no conversion is done and no 64-bit operations are
performed (note that this is the case on X64).

Willy.
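
The point about the unused sign computation can be illustrated in C# as well. The sketch below uses made-up names and assumes nothing beyond ordinary int/long arithmetic. It shows that the upper half of the widened value is just the sign (what "sar eax,0x1f" produces), and that it cannot affect the result because the upper 32 bits of the mask are zero.

```csharp
using System;

static class SignExtension
{
    // Upper 32 bits of x after widening to long (the effect of conv.i8).
    public static int HighHalf(int x) { return (int)((long)x >> 32); }

    // What "mov eax,esi / sar eax,0x1f" computes: 0 for x >= 0, -1 for x < 0.
    public static int SarBy31(int x) { return x >> 31; }

    // 64-bit form of the test, matching the IL emitted for the uint literal.
    public static bool Wide(int x) { return ((long)x & 0xFFFFFF00L) == 0; }

    // 32-bit form of the test.
    public static bool Narrow(int x) { return (x & -256) == 0; }

    static void Main()
    {
        int[] samples = { int.MinValue, -256, -1, 0, 255, 256, int.MaxValue };
        foreach (int x in samples)
            Console.WriteLine("{0,11}: high half = {1,2}, wide = {2}, narrow = {3}",
                              x, HighHalf(x), Wide(x), Narrow(x));
    }
}
```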
 
