Unicode Encoding

Guest

Hi

I read http://dotgnu.org/pnetlib-doc/Syste...nicodeEncoding.GetBytes(System.String) Method and here is my code:

<code>
string c = "Hello, How are you today?"; // it should be another language, not English

UnicodeEncoding Unicode = new UnicodeEncoding();

byte[] encodedBytes = Unicode.GetBytes(c);
</code>

When I inspected encodedBytes at a breakpoint, the value shown was "System.Byte[]".

Why did it not show the byte values, such as 08F36AED2FGH341D....?

How should I fix it? I need to pass that byte array to another server over HTTPS, and that server only accepts Unicode with 2 bytes per character.

Thanks
 
Morten Wennevik


The debugger shows "System.Byte[]" because encodedBytes is a System.Byte[], an array of bytes. The values are in there. If the server expects an array of bytes, just send encodedBytes.

If you specifically need something like 08F36AED2FGH341D, which is a string and not a byte[], then you need to convert it manually.

byte[] encodedBytes = Unicode.GetBytes(c);
StringBuilder sb = new StringBuilder();

// for every byte, convert it to a hexadecimal string value,
// and make sure it's always 2 characters
for(int i = 0; i < encodedBytes.Length; i++)
{
    sb.Append(Convert.ToString(encodedBytes[i], 16).PadLeft(2, '0'));
}

// this will output the string like you wanted, in uppercase.
MessageBox.Show(sb.ToString().ToUpper());
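
As a side note to the loop above, here is a minimal sketch of two other ways to get the same hex string, using the built-in "X2" format specifier and BitConverter.ToString. The class name HexDump and its method names are made up for illustration, and the sample text "Hi" is only there to keep the output short. It also shows why the breakpoint displayed "System.Byte[]": that is simply what an array's ToString() returns.

```csharp
using System;
using System.Text;

public static class HexDump
{
    // Hex string via the "X2" format: two uppercase hex digits per byte.
    public static string WithFormat(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    // BitConverter.ToString yields "48-00-69-00"; strip the dashes.
    public static string WithBitConverter(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "");
    }

    public static void Main()
    {
        byte[] encodedBytes = new UnicodeEncoding().GetBytes("Hi");
        Console.WriteLine(encodedBytes);                    // prints the type name, System.Byte[]
        Console.WriteLine(WithFormat(encodedBytes));        // 48006900
        Console.WriteLine(WithBitConverter(encodedBytes));  // 48006900
    }
}
```

Both helpers return uppercase hex already, so no ToUpper() call is needed afterwards.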


Happy coding!
Morten Wennevik [C# MVP]
 
Mihai N.

string c = "Hello, How are you today?"; //it should be other language, not English

You should make sure the system's default code page matches the one used by your string. You should not try to compile sources containing Japanese text on a U.S. system, for instance.
In general, hard-coding strings in the code is a bad idea.
You should use resources.
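
For a quick test, short of moving the text into resources, one way to sidestep the code page problem is to write the non-ASCII characters as \u escapes so the source file stays pure ASCII. This is only a small sketch: the class name EscapedLiteral and the sample text (你好, U+4F60 U+597D) are chosen for illustration and are not from the original post.

```csharp
using System;
using System.Text;

public static class EscapedLiteral
{
    // U+4F60 U+597D, written as escapes so the compiler never has
    // to guess which code page the source file was saved in.
    public const string Greeting = "\u4F60\u597D";

    public static void Main()
    {
        byte[] bytes = Encoding.Unicode.GetBytes(Greeting);
        Console.WriteLine(Greeting.Length);              // 2 (UTF-16 code units)
        Console.WriteLine(BitConverter.ToString(bytes)); // 60-4F-7D-59 (little-endian)
    }
}
```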
 
Guest

Hi

I tried the proposed code and it works fine. However, the target server cannot display the Chinese words properly. Here is the source code of the VB version for comparison. What could the problem be?

C# code:
<code>
string c = "Chinese string .......";

UnicodeEncoding Unicode = new UnicodeEncoding();
byte[] encodedBytes = Unicode.GetBytes(c);

StringBuilder sb = new StringBuilder();

for(int i = 0; i < encodedBytes.Length; i++)
{
    sb.Append(Convert.ToString(encodedBytes[i], 16).PadLeft(2, '0'));
}
MessageBox.Show(sb.ToString().ToUpper());
</code>

Here is the VB 6 code:
<code>
Public Function EnUnicode(Character As String) As String
    Dim x() As Byte
    Dim a As Long
    Dim i As Long

    a = Asc(Character)

    If a < 0 Or a > 127 Then
        'Double Byte Chinese Character
        x = StrConv(Character, vbWide, 3076) ' Convert string.
        For i = UBound(x) To 0 Step -1
            If x(i) < 16 Then
                EnUnicode = EnUnicode & "0" & Hex(x(i))
            Else
                EnUnicode = EnUnicode & Hex(x(i))
            End If
        Next
    Else
        'ASCII Character (Single Byte)
        If a < 16 Then
            EnUnicode = EnUnicode & "000" & Hex(a)
        Else
            EnUnicode = EnUnicode & "00" & Hex(a)
        End If

    End If
End Function
</code>

Thanks for help
 
Ed Courtenay


It looks to me like you should be using UTF-8 as opposed to Unicode (UTF-16) encoding.

Have a look at:

http://www.yoda.arachsys.com/csharp/unicode.html

and take particular note of the differences between UTF-8 and UTF-16.

Oh, and pay homage to Jon Skeet for such a great resource!
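
To see the difference in practice, here is a minimal sketch that encodes the same two-character string both ways and prints the bytes. The class name EncodingComparison, the Hex helper and the sample text are made up for illustration. Which of the two the server actually wants is something only its documentation can settle.

```csharp
using System;
using System.Text;

public static class EncodingComparison
{
    // Returns the encoded bytes as dash-separated hex, e.g. "41-00".
    public static string Hex(Encoding encoding, string text)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string text = "A\u4E2D"; // 'A' followed by the Chinese character U+4E2D

        // UTF-8: 1 byte for 'A', 3 bytes for U+4E2D.
        Console.WriteLine(Hex(Encoding.UTF8, text));    // 41-E4-B8-AD

        // UTF-16 (what UnicodeEncoding produces): 2 bytes each, low byte first.
        Console.WriteLine(Hex(Encoding.Unicode, text)); // 41-00-2D-4E
    }
}
```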
 
Morten Wennevik

OK, I'm not too familiar with Visual Basic, but it looks to me like it converts the string to ASCII, in which case it destroys any Unicode. Note that there is an AscW function for Unicode. Asc doesn't support the full range of Unicode, and since the Chinese characters are located in the upper region, it may not be able to handle them.

Also, it all seems a bit awkward when you already get the bytes properly with my method (I checked with Chinese characters and converted the hex string value back to a string). It may be that the server expects the Unicode bytes in reverse order. In that case you need to use Encoding.BigEndianUnicode.GetBytes(string).

// for every byte in the array
for(int i = 0; i < encodedBytes.Length; i++)
{
    // add the string value to the StringBuilder
    // I could have used a string and just += the new value,
    // but I suspect StringBuilder is faster for this
    sb.Append(
        // make a string out of the current byte at position i
        // and use hexadecimal values
        Convert.ToString(encodedBytes[i], 16).
        // for one-character strings (values 0-15)
        // add a 0 at the front
        PadLeft(2, '0')
    );
}
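
To make the byte-order point concrete, here is a minimal sketch. It assumes the server wants each character as four hex digits with the high byte first, which is what the "00" & Hex(a) branch of the VB 6 function appears to produce for ASCII characters. The class name BigEndianHex and the method name Encode are made up for illustration.

```csharp
using System;
using System.Text;

public static class BigEndianHex
{
    // Encodes text as UTF-16 with the high byte first and returns it
    // as uppercase hex, four digits per UTF-16 code unit.
    public static string Encode(string text)
    {
        byte[] bytes = Encoding.BigEndianUnicode.GetBytes(text);
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        for (int i = 0; i < bytes.Length; i++)
        {
            sb.Append(bytes[i].ToString("X2"));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Encode("A"));      // 0041
        Console.WriteLine(Encode("\u4E2D")); // 4E2D
    }
}
```

With the default UnicodeEncoding the same two inputs would come out as 4100 and 2D4E, which a server expecting high-byte-first input would read as different characters.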

Happy coding!
Morten Wennevik [C# MVP]
 
