UTF-16

Tony Johansson

Hello!

Here is some text, and somewhere in the middle it says that "a
code point is encoded into a sequence of one or more 16-bit values".

I thought that if you use UTF-16, a code point is encoded as a single
16-bit value, not, as the text says, as a sequence of one or more
16-bit values?

The Unicode Standard identifies each Unicode character with a unique 21-bit
scalar number called a code point, and defines the UTF-16 encoding form that
specifies how a code point is encoded into a sequence of one or more 16-bit
values. Each 16-bit value ranges from hexadecimal 0x0000 through 0xFFFF and
is stored in a Char structure. The value of a Char object is its 16-bit
numeric (ordinal) value.
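
For reference, here is a minimal C# sketch of my own (not part of the quoted
documentation; the class name is just illustrative) that reads the 16-bit
ordinal value the text is talking about:

using System;

class CharOrdinalDemo
{
    static void Main()
    {
        char c = 'A';              // the BMP character U+0041
        ushort ordinal = c;        // char converts implicitly to its 16-bit value
        Console.WriteLine($"U+{ordinal:X4}");   // prints "U+0041"
    }
}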

//Tony
 
Arne Vajhøj

> I thought that if you use UTF-16, a code point is encoded as a single
> 16-bit value, not, as the text says, as a sequence of one or more
> 16-bit values?


When Unicode and UTF-16 were designed, there were fewer than
65,536 code points, so every code point always fit in a
single 16-bit value.
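
As an illustration (my own sketch, not from the original post), such a code
point is stored as a single char in .NET:

using System;

class SingleUnitDemo
{
    static void Main()
    {
        string s = "é";                          // U+00E9, well below 65,536
        Console.WriteLine(s.Length);             // 1 -- one 16-bit value
        Console.WriteLine($"U+{(int)s[0]:X4}");  // prints "U+00E9"
    }
}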

When Unicode grew past 65,536 code points, they had to use
two 16-bit values for certain code points.
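
Here is a sketch of that case (again my own example; U+1D11E, the musical
G clef symbol, is just one arbitrary choice of such a code point):

using System;

class SurrogatePairDemo
{
    static void Main()
    {
        string s = "\U0001D11E";                        // a code point above U+FFFF
        Console.WriteLine(s.Length);                    // 2 -- two 16-bit values
        Console.WriteLine(char.IsHighSurrogate(s[0]));  // True
        Console.WriteLine(char.IsLowSurrogate(s[1]));   // True
        int codePoint = char.ConvertToUtf32(s, 0);      // recombine the pair
        Console.WriteLine($"U+{codePoint:X}");          // prints "U+1D11E"
    }
}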

You may work as a software developer in Sweden for 40 years
and never see a code point that requires two 16-bit values
in real life.

Arne
 
