Unicode Compression?

P

Paul Fenton

What is it and what does it do? I see it as a property in text and
memo fields. Sometimes it's 'Yes' and sometimes it's 'No'. What are
the rules for determining those parameters? (Acces 2000)



Paul Fenton
 
A

Allen Browne

In the old days, we used one byte (8 bits) to store a character. 255
possible characters was plenty for all upper and lower case characters,
numbers, punctuation, symbols, and control characters. But it was not enough
for foreign languages such as Chinese characters.

Unicode uses two bytes to store a character, so it can handle 64k different
characters, but it also means that you are using twice as much storage
space. For English, every second character is unused. Unicode compression
addresses this so it does not use two bytes of disk space for every text
character in your database.

I'm not aware of any rules on which setting to use. Without testing, I would
expect the extra processor time (uncompressing the data) to be almost
completely offset by the smaller disk read. I have no idea whether that
would hold for disk writes as well. There are several other factors that
completely change the picture, such as whether the database is encrypted,
whether data is coming across a network, whether the processing is being
done on the server (e.g. Terminal Server), ...

You also allude to the fact that Access does not set this property
consistently. Whether it defaults to Yes or No depends whether you created
the field through the interface, via DAO, with a DDL query, or through ADOX.
Unfortunately, there is just no consistency with this, but it really does
not matter to the integrity of the data in the same way as the
AllowZeroLength setting does (which is also set inconsistently).
 
S

Sylvain Lafontaine

The first time that I have seen that, it was my understanding that the
uncompressed Unicode format was UTF16 and the compressed one was UTF8.
(UTF8 will compress the data for latin alphabets with no or few accents,
like english or french, but it will have the opposite effect for other
languages. Also, contrary to UTF16, UTF8 is not really a universally
adopted standard and may not be understood by many Unicode applications.)

However, I never took the time to verify that assumption.

S. L.
 
P

Paul Fenton

Thank you Allen. I did find that MS recommends the parameter be set
to "Yes" to improve performance.


Paul Fenton
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top