Get unicode character by bit shifting

  • Thread starter: tcomer

Hello,

I have a pretty interesting problem here.

OK, I have an integer that needs to be right shifted and then
converted to a char, which is then used to build a string. Here's
an example of the steps:

long Value = 1667688574;

char c1 = Value >> 24; // produces '99' or 'c'
char c2 = Value >> 16; // produces '25446' or '捦'
char c3 = Value >> 8; // generates OverflowException "Value was
                      // either too large or too small for a character."
char c4 = Value >> 0; // never gets executed due to the error above

The error gets thrown using long, ulong, etc..
The same scenario in PHP produces 'c' 'f' 'è' and '~', respectively.
Any ideas as to what I may be doing wrong?
 
I also forgot to mention that my goal is to duplicate the php scenario
that I mentioned above. The desired output is 'c' 'f' 'è' and '~'
 
Shifting right keeps the "upper" bits, so Value >> 24 will be correct.
With the other shifts you just get more bits than you want unless you
mask with & 0xFF.
//CY
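
A minimal sketch of the masking suggestion above, assuming the goal is the PHP-style output where each of the four bytes becomes one character. The class and method names (ShiftMaskDemo, ToFourChars) are made up for illustration and are not from the original post.

```csharp
using System;

static class ShiftMaskDemo
{
    // Splits the low 32 bits of value into four bytes, most significant first,
    // and maps each byte to the char with the same code point (U+0000..U+00FF).
    public static string ToFourChars(long value)
    {
        var chars = new char[4];
        for (int i = 0; i < 4; i++)
        {
            int shift = 24 - 8 * i;                     // 24, 16, 8, 0
            chars[i] = (char)((value >> shift) & 0xFF); // mask keeps one byte
        }
        return new string(chars);
    }

    static void Main()
    {
        // Builds the string "cfè~" for the value from the question.
        Console.WriteLine(ToFourChars(1667688574));
    }
}
```

Because the mask limits every value to 0..255, the cast to char cannot overflow, even in a checked context.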
 
The >> 24 will also fail if the high bit is set in Value (if it
is negative).

Arne
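
To illustrate the point about the high bit, here is a small sketch. It assumes a value whose 32-bit pattern has the top bit set (0xE86366E8 is an arbitrary example, not a value from the thread), and the helper names are hypothetical.

```csharp
using System;

static class SignExtensionDemo
{
    // Top byte taken without masking: >> on a signed type copies the sign bit,
    // so a negative input gives a negative result that does not fit in a char.
    public static long TopByteUnmasked(long value) => value >> 24;

    // Same shift, then a mask that discards the sign-extended bits.
    public static int TopByteMasked(long value) => (int)((value >> 24) & 0xFF);

    static void Main()
    {
        long negative = unchecked((int)0xE86366E8);   // high bit of the int is set
        Console.WriteLine(TopByteUnmasked(negative)); // -24
        Console.WriteLine(TopByteMasked(negative));   // 232
    }
}
```

With the mask in place, the top byte comes out as 232 (0xE8) regardless of the sign of the input.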
 
tcomer said:
OK, I have an integer that needs to be right shifted and then
converted to a char, which is then used to build a string. Here's
an example of the steps:

long Value = 1667688574;

char c1 = Value >> 24; // produces '99' or 'c'
char c2 = Value >> 16; // produces '25446' or '捦'
char c3 = Value >> 8; // generates OverflowException "Value was
                      // either too large or too small for a character."
char c4 = Value >> 0; // never gets executed due to the error above

The error gets thrown using long, ulong, etc..
The same scenario in php produces 'c' 'f' 'è' and '~', respectively.

You have a long holding a value that could have fit in an int.

Besides the correct suggestion to use & 0xFF, you may want to
consider redesigning the logic a bit.

If you actually want a text string with the first byte as the
first char, then you can use:

Encoding.Default.GetString(BitConverter.GetBytes(Value))

Arne
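
A hedged sketch of the BitConverter approach, with two caveats spelled out. BitConverter.GetBytes returns the bytes in machine byte order (low byte first on little-endian hardware), and a long yields eight bytes, so this version takes an int and reverses when needed. Encoding.Default also depends on the platform, so the sketch assumes ISO-8859-1 is the encoding wanted, since it maps each byte to the code point of the same value. The names BytesToStringDemo and Decode are illustrative only.

```csharp
using System;
using System.Text;

static class BytesToStringDemo
{
    public static string Decode(int value)
    {
        byte[] bytes = BitConverter.GetBytes(value); // 4 bytes, machine byte order
        if (BitConverter.IsLittleEndian)
            Array.Reverse(bytes);                    // put the high byte first
        // Assumption: ISO-8859-1, where byte 0xE8 decodes to 'è' (U+00E8).
        return Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
    }

    static void Main()
    {
        // Builds the string "cfè~" for the value from the question.
        Console.WriteLine(Decode(1667688574));
    }
}
```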
 
The >> 24 will also fail if the high bit is set in Value (if it
is negative).

Arne

And long Value = 1667688574; looks negative?
OK, shifting it leaves 63 (hex), and setting the negative.. 163, and then
& 0xFF and 63 again... *doh*
//CY
 
long Value = 1667688574;
....
Any ideas as to what I may be doing wrong?

What are you really trying to do?
Unicode characters go beyond 0xFF, taking more than one byte.
So, even if you & 0xFF as suggested, you risk damaging some
international text.
If you are attempting some kind of encryption, then you are probably
better off with a strong, standard algorithm.
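
As a small illustration of that risk, the sketch below masks a character above 0xFF and counts UTF-8 bytes. The euro sign (U+20AC) is just an example character chosen here, and the helper names are hypothetical.

```csharp
using System;
using System.Text;

static class BeyondLatin1Demo
{
    // Keeps only the low byte of a UTF-16 code unit, as "& 0xFF" does.
    public static char LowByteOnly(char c) => (char)(c & 0xFF);

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int Utf8Length(string s) => Encoding.UTF8.GetByteCount(s);

    static void Main()
    {
        Console.WriteLine((int)LowByteOnly('\u20AC')); // 172 (U+00AC), no longer the euro sign
        Console.WriteLine(Utf8Length("\u20AC"));       // 3
        Console.WriteLine(Utf8Length("\u00E8"));       // 2
    }
}
```

So masking is safe for values that are really single bytes, but it silently changes any character whose code point is above 0xFF.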
 