The char datatype


Tony Johansson

Hello!

A char represents a single 16-bit (Unicode) character.

Assume a Chinese user wants to store a Chinese character that doesn't fit in 16 bits in a char datatype. What happens then?

//Tony
 

Arne Vajhøj

A char represents a single 16-bit (Unicode) character.

Assume a Chinese user wants to store a Chinese character that doesn't fit in 16 bits in a char datatype. What happens then?

It gets encoded in 2 chars.
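
For illustration, a minimal C# sketch of that behaviour (U+20BB7 is only an example of a code point outside the 16-bit range):

using System;

class Demo
{
    static void Main()
    {
        // A code point above U+FFFF cannot fit in one 16-bit char,
        // so .NET stores it as two chars (a surrogate pair) in the string.
        string s = char.ConvertFromUtf32(0x20BB7); // a CJK ideograph outside the BMP
        Console.WriteLine(s.Length);               // prints 2
    }
}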

Arne
 

Tony Johansson

Felix Palmen said:
* Tony Johansson said:
A char represents a single 16-bit (Unicode) character.

Assume a Chinese user wants to store a Chinese character that doesn't fit in 16 bits in a char datatype. What happens then?

.NET uses the UTF-16 encoding by default. This is different from UCS-2
in that it allows multi-word sequences (just like UTF-8 allows
multi-byte sequences), so it's possible to represent ANY Unicode
character in a .NET string (but of course, not necessarily in a single
character).

Regards,
Felix

--
Felix Palmen (Zirias) + [PGP] Felix Palmen <[email protected]>
web: http://palmen-it.de/ | http://palmen-it.de/pub.txt
my open source projects: | Fingerprint: ED9B 62D0 BE39 32F9 2488
http://palmen-it.de/?pg=pro + 5D0C 8177 9D80 5ECF F683

But according to the definition of Unicode, some characters are encoded in more than 16 bits. What would then happen with a char, which only uses 16 bits?

If every character in the whole world fits into 16 bits, why does the definition of Unicode use more than 16 bits for some characters?

//Tony
 

Arne Vajhøj

But according to the definition of Unicode, some characters are encoded in more than 16 bits. What would then happen with a char, which only uses 16 bits?

If every character in the whole world fits into 16 bits, why does the definition of Unicode use more than 16 bits for some characters?

Not all code points can be represented in 16 bits.

Those that require more than 16 bits will be stored in
two C# chars.
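
As an illustration (again using U+20BB7 purely as an example code point), the two chars can be inspected like this:

using System;

class SurrogateDemo
{
    static void Main()
    {
        // U+20BB7 needs more than 16 bits, so it occupies two C# chars.
        string s = char.ConvertFromUtf32(0x20BB7);

        Console.WriteLine(char.IsHighSurrogate(s[0])); // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));  // True

        // The original code point can be recovered from the surrogate pair.
        Console.WriteLine(char.ConvertToUtf32(s[0], s[1]).ToString("X")); // 20BB7
    }
}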

Arne
 

Felix Palmen

* Tony Johansson said:
But according to the definition of Unicode, some characters are encoded in more than 16 bits. What would then happen with a char, which only uses 16 bits?

Google: UTF-8, UTF-16, multibyte-sequence, ...

A char is a char and will always be 16 bits in .NET. You need a string
for UTF-16 to work properly.
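
A small sketch of that point, walking a string by code point rather than by char (the emoji U+1F600 is only an example of a character that needs a surrogate pair):

using System;

class CodePointDemo
{
    static void Main()
    {
        // One BMP character plus one character that needs a surrogate pair.
        string s = "A" + char.ConvertFromUtf32(0x1F600);

        // Length counts 16-bit chars (3 here), not characters (2 here).
        Console.WriteLine(s.Length); // 3

        // Walking the string by code point instead of by char:
        for (int i = 0; i < s.Length; i += char.IsSurrogatePair(s, i) ? 2 : 1)
        {
            Console.WriteLine("U+" + char.ConvertToUtf32(s, i).ToString("X4"));
        }
        // Output: U+0041, U+1F600
    }
}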
 

Markus Schaber

Hello, Tony,

Tony Johansson said:
A char represents a single 16-bit (Unicode) character.

Assume a Chinese user wants to store a Chinese character that doesn't fit in 16 bits in a char datatype. What happens then?

Arne and Felix already explained it in brief; http://en.wikipedia.org/wiki/Utf-16 contains an in-depth explanation.


Regards,
Markus
 
