Odd string encoding behaviour

Miki Watts · Feb 21, 2004

I'm having a problem with encoding a string... here's my code:

byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);

Now, this works fine as long as there are no bytes over 127 (0x7F); for
example, a 0x99 byte comes out as a 0x3F byte.
I know that ASCII is just 7 bits, but I tried the other encodings and
they didn't give me what I needed... UTF7 did the same thing as ASCII,
UTF8 gave 0xC2 0x99 for each 0x99 byte, and Unicode gave good results, but
in Unicode format.

What am I doing wrong?

Miki

Miki Watts · Feb 22, 2004

Well, I managed to find a solution of some sort:

System.Text.Encoding e = System.Text.Encoding.GetEncoding("iso-8859-1");
output = BitConverter.ToString(e.GetBytes(FieldContent)).Replace("-"," ");

Is there something equivalent to the iso-8859-1 codepage?

Miki

Mihai N. · Feb 22, 2004

Is there something equivalent to the iso-8859-1 codepage?
1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)

Jon Skeet [C# MVP] · Feb 22, 2004

Miki Watts said:
I'm having a problem with encoding a string... here's my code:

byte[] s = System.Text.Encoding.ASCII.GetBytes(FieldContent);

Now, this works fine as long as there are no bytes over 127 (0x7F); for
example, a 0x99 byte comes out as a 0x3F byte.
I know that ASCII is just 7 bits, but I tried the other encodings and
they didn't give me what I needed... UTF7 did the same thing as ASCII,
UTF8 gave 0xC2 0x99 for each 0x99 byte, and Unicode gave good results, but
in Unicode format.

What am I doing wrong?

Nothing. What do you think it's doing wrong? It's doing exactly what it
should be - it's encoding your text in the various different ways,
depending on the encoding type used.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information.
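
To make the point concrete, here is a minimal sketch of what each encoding does with the single character U+0099 (the "0x99 byte" from the question). It assumes a recent C# compiler with top-level statements; `Hex` and `fieldContent` are hypothetical names for illustration only, not part of .NET or the original code.

```csharp
using System;
using System.Text;

// Hex dump helper (hypothetical, for illustration only).
static string Hex(byte[] bytes) => BitConverter.ToString(bytes).Replace("-", " ");

// A one-character string holding U+0099, standing in for FieldContent.
string fieldContent = "\u0099";

// ASCII has no character above U+007F, so the encoder substitutes '?' (0x3F).
string ascii = Hex(Encoding.ASCII.GetBytes(fieldContent));

// UTF-8 needs two bytes for anything in U+0080..U+07FF.
string utf8 = Hex(Encoding.UTF8.GetBytes(fieldContent));

// Encoding.Unicode is little-endian UTF-16: two bytes per char, low byte first.
string utf16 = Hex(Encoding.Unicode.GetBytes(fieldContent));

// ISO-8859-1 maps U+0000..U+00FF straight onto the byte values 0x00..0xFF.
string latin1 = Hex(Encoding.GetEncoding("iso-8859-1").GetBytes(fieldContent));

Console.WriteLine($"ASCII:      {ascii}");   // 3F
Console.WriteLine($"UTF-8:      {utf8}");    // C2 99
Console.WriteLine($"UTF-16LE:   {utf16}");   // 99 00
Console.WriteLine($"ISO-8859-1: {latin1}");  // 99
```

Each result is a correct encoding of the same character; they only differ in how that character is represented as bytes.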

Jon Skeet [C# MVP] · Feb 22, 2004

Mihai N. said:
1252 is the MS equivalent (it is in fact iso-8859-1 with some extras)

Sort of - using the 8859-1 code page, you'll actually end up with bytes
effectively being "passed through", even if they shouldn't really be.
(I'm talking about characters 128-159, IIRC.) Code page 1252 has
entirely different characters in that range (the extras you mean).

If the OP wants 8859-1, he can just use the form he's already shown, or
ask for code page number 28591. It's not a good idea though, if he's
basically using it to treat a string as a sequence of bytes instead of
chars.
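
As a rough sketch of the two ways of asking for 8859-1 mentioned above, the following looks the encoding up by name and by code page number, then checks that all 256 byte values survive a trip through a string. The variable names are made up for illustration, and the code assumes a recent C# compiler with top-level statements.

```csharp
using System;
using System.Linq;
using System.Text;

// The same encoding, requested by name and by code page number.
Encoding byName = Encoding.GetEncoding("iso-8859-1");
Encoding byCodePage = Encoding.GetEncoding(28591);
bool sameCodePage = byName.CodePage == byCodePage.CodePage;

// Every possible byte value, 0x00..0xFF.
byte[] original = new byte[256];
for (int i = 0; i < original.Length; i++)
{
    original[i] = (byte)i;
}

// Decode to a string and encode back again.
string asText = byName.GetString(original);
byte[] roundTripped = byCodePage.GetBytes(asText);
bool lossless = original.SequenceEqual(roundTripped);

Console.WriteLine($"Same code page: {sameCodePage}");
Console.WriteLine($"Round trip is lossless: {lossless}");
```

The round trip works because 8859-1 maps each byte to the character with the same number. That makes it possible to smuggle bytes through a string, but it does not make it a good idea, for the reasons given above.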

Miki Watts · Feb 22, 2004

If the OP wants 8859-1, he can just use the form he's already shown, or
ask for code page number 28591. It's not a good idea though, if he's
basically using it to treat a string as a sequence of bytes instead of
chars.

Well, yes, basically that is what I want to do: I want a string (i.e.
something that resizes dynamically) that contains the exact bytes I want,
without interpretation or encoding. I haven't found any other construct
that can do this for me, though. (byte[] should be what I need, but it's
not dynamic.)

Jon Skeet [C# MVP] · Feb 22, 2004

Miki Watts said:
Well, yes, basically that is what I want to do: I want a string (i.e.
something that resizes dynamically) that contains the exact bytes I want

Strings don't contain bytes. They contain characters. You shouldn't use
them for binary data - that's not what they're designed for.

without interpretation or encoding. I haven't found any other construct
that can do this for me, though. (byte[] should be what I need, but it's
not dynamic.)

String itself isn't dynamic either - once created, a string is fixed.
It just has methods to make it easy to create a new string with (say)
the value of two strings concatenated.

I suspect that MemoryStream might be helpful to you though.
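
A minimal sketch of that suggestion, assuming the goal is simply a byte buffer that grows as data is appended. The byte values and variable names are invented for illustration, and the code assumes a recent C# compiler with top-level statements.

```csharp
using System;
using System.IO;

// MemoryStream grows its internal buffer as bytes are written to it.
using var buffer = new MemoryStream();

// Append a single byte, then a block of bytes; no encoding is involved.
buffer.WriteByte(0x99);
byte[] more = { 0x01, 0xFF };
buffer.Write(more, 0, more.Length);

// ToArray copies out exactly the bytes written so far.
byte[] contents = buffer.ToArray();
string dump = BitConverter.ToString(contents).Replace("-", " ");

Console.WriteLine(dump);  // 99 01 FF
```

An encoding only needs to come into it at the point where real text has to be written into, or read out of, the buffer.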

Miki Watts · Feb 22, 2004

I suspect that MemoryStream might be helpful to you though.

ok, thanks. I'll check it out.
