Help: Unicode: UTF-8 CMD prompt

L. Athena W.

I'm discombobulated... it's a simple thing, really... I thought...
but how does one display Unicode (UTF-8) text at the command prompt?

I can display the characters in Notepad, WordPad, and an HTML browser.
I tried both the non-TT Courier and the Unicode "Courier New".

None of them display properly in the Windows command window.

I thought Win-2000 and above were supposed to use Unicode as the basic
character set, and I'm using the Unicode TT "Lucida Console" for display
in my CMD windows ---
-----------------------
Which brings me to
2) a second, less troublesome point, but still annoying...

Why doesn't cmd allow me to choose Courier New as a monospace font?
Some monospace fonts (at least they appear to be) on my system:
Tt @MS Mincho
Ot Andale Mono
Tt Bordofixed
Courier
Ot Courier New
Tt Dactylographe EXp
Fixedsys
Tt HE_TERMINAL
Tt HyperFont
Hyperfont Dk.
Hyperfont Lt.
Ot Lucida Console
Ot Lucida Sans Typewriter
Tt Monspace 821 BT
Tt Monspace 821 DT
Tt MS Mincho
Tt Normafixed
Tt OCRB
Tt Oloron 437
Tt Oloron <Several variants>
Terminal
vt100

Most are Ot or Tt fonts that, I would think, should be suitable for a tty. For
that matter, why can't proportional fonts be used on mono TTYs, with space simply
allocated for the largest character (and padding filling out the space around
thinner characters)?

Anyway... the main dig is: why no Unicode on a Unicode-based OS? Also, a minor
third dig: if I use an "alternate keyboard", like selecting the French keyboard, I
can enter _some_ Unicode characters with key combinations like "`"+"a" = "à",
but if I go into the keyboard map program, it says I should be able to hold down
ALT and press 0224 on the keypad. But I get nothing (like trying in this
compose window or a WinPad/TextPad window... and of course not in a cmd window).

Lucida Console seems to be the only monospace font (other than raster fonts) allowed
for me to choose. I know that Courier New is monospace -- why won't the cmd line/window
allow me to choose another monospace Unicode character set?

Any ideas on Unicode display & input at the cmd prompt?

Thanks,
-linda
 
Paul R. Sadowski [MVP]

L. Athena W. said:
> I'm discombobulated... it's a simple thing, really... I thought...
> but how does one display Unicode (UTF-8) text at the command prompt?

Have you tried
chcp 10000
 
Michael Bednarek

On Mon, 18 Oct 2004 16:58:51 -0700, "L. Athena W." wrote in
microsoft.public.win2000.cmdprompt.admin,
microsoft.public.platformsdk.mslayerforunicode:
> I'm discombobulated... it's a simple thing, really... I thought...
> but how does one display Unicode (UTF-8) text at the command prompt?
>
> I can display the characters in Notepad, WordPad, and an HTML browser.
> I tried both the non-TT Courier and the Unicode "Courier New".
>
> None of them display properly in the Windows command window.

Which font do you use in the console window? Which code page?
Here, with "Raster Fonts" or "Lucida Console" and others, and code page
850, they do.

> I thought Win-2000 and above were supposed to use Unicode as the basic
> character set, and I'm using the Unicode TT "Lucida Console" for display
> in my CMD windows ---
> -----------------------
> Which brings me to
> 2) a second, less troublesome point, but still annoying...
>
> Why doesn't cmd allow me to choose Courier New as a monospace font?
> Some monospace fonts (at least they appear to be) on my system:
> [snip]
>
> Lucida Console seems to be the only monospace font (other than raster fonts)
> allowed for me to choose. I know that Courier New is monospace -- why won't
> the cmd line/window allow me to choose another monospace Unicode character set?
>
> Any ideas on Unicode display & input at the cmd prompt?

See <http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q247815>.

Summary: Create entries in registry key:
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont

Here's mine:

[HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Console\TrueTypeFont]
"0"="Lucida Console"
"00"="Andale Mono"
"000"="Monofonto"
 
He Shiming

My practical experience suggests that the console (on Windows NT) supports
Unicode; however, due to the default font settings, you aren't going to get
real multilingual text displayed the way it is in the GUI.

One more thing: usually the console cannot tell a UTF-8 sequence apart from
a sequence in the local character encoding. Both are basically just a
series of bytes, so they have the same representation.
To display them correctly, you need to convert the UTF-8 representation into
Unicode wide characters. One API that's known to do the job is
MultiByteToWideChar. Use CP_UTF8 as the code page setting. Once you've got
the wchar_t or WCHAR sequences, you can print them with "wprintf" or the wcout
stream in C++.
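
To illustrate why the bytes alone are ambiguous, here is a minimal Python sketch. It does not call the Win32 API; Python's bytes.decode stands in for the MultiByteToWideChar conversion described above, and the helper name code_points is made up for this example. It decodes the same two bytes once as UTF-8 and once as OEM code page 437, and reports code points rather than glyphs so the result does not depend on the console font.

```python
# The UTF-8 encoding of U+00E0 (a with grave accent) is the two bytes C3 A0.
utf8_bytes = "\u00e0".encode("utf-8")


def code_points(raw, encoding):
    """Decode raw bytes with the given encoding; return 'U+XXXX' labels.

    Hypothetical helper for illustration only.
    """
    return ["U+{:04X}".format(ord(ch)) for ch in raw.decode(encoding)]


# Interpreted as UTF-8, the two bytes form a single character.
as_utf8 = code_points(utf8_bytes, "utf-8")

# Interpreted as code page 437, each byte becomes its own unrelated character.
as_cp437 = code_points(utf8_bytes, "cp437")

if __name__ == "__main__":
    print("bytes:", utf8_bytes.hex())
    print("as UTF-8:", as_utf8)
    print("as cp437:", as_cp437)
```

The point of the sketch is only that the decoder has to be told the bytes are UTF-8; nothing in the byte sequence itself says so.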

--
He Shiming / www.imediaman.com


 
L. Athena W.

Michael said:
> Which font do you use in the console window? Which code page?
> Here, with "Raster Fonts" or "Lucida Console" and others, and code page
> 850, they do.

For some reason, my default code page is set to 437. Where is the default set?
Is it among the system env vars, or is it stored on a per-shell
command-shortcut basis?

(ps - rot13?! Good thing I have a Linux box handy :))
 
Michael Bednarek

On Tue, 07 Dec 2004 18:42:14 -0800, "L. Athena W." wrote in
microsoft.public.win2000.cmdprompt.admin,
microsoft.public.platformsdk.mslayerforunicode:

The only two methods I know to set the Code Page are either with the
CHCP command or in this Registry Key (REG_SZ):
HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage\OEMCP
There must be a GUI method, but I don't know it.

At home, my Code Page is 850, at work it's 437, both using Raster Fonts
in the console, and both show accented and umlaut characters.

This thread started six weeks ago - what is the source of your
discombobulation again?

> (ps - rot13?! Good thing I have a Linux box handy :))

Nal qrprag arjfernqre fubhyq cebivqr ebg13 - Ntrag qbrf vg urer.
 
Michael \(michka\) Kaplan [MS]

Michael Bednarek said:
> The only two methods I know to set the Code Page are either with the
> CHCP command

This will change the code page for the console's output.

> or in this Registry Key (REG_SZ):
> HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage\OEMCP

This is the CP_OEMCP. Never change this directly! Always change the "default
system locale" (aka the "language for non-Unicode programs"). It must be
kept in sync with the CP_ACP or bad things can happen to your machine.

> There must be a GUI method, but I don't know it.

See above. :)

> At home, my Code Page is 850, at work it's 437, both using Raster Fonts
> in the console, and both show accented and umlaut characters.

850 is one of the European OEMCPs, while 437 is the US one.
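
As a rough illustration of how these code pages relate, the following Python sketch decodes single byte values under the two OEM code pages mentioned here and under Windows-1252 (assumed here as a typical Western European ANSI code page; it is not named in this thread). The helper name describe is made up for the example. It reports Unicode character names instead of glyphs, so the output does not depend on the console's own code page or font.

```python
import unicodedata

# Two OEM code pages from this thread plus one ANSI code page (an assumption).
CODE_PAGES = ("cp437", "cp850", "cp1252")


def describe(byte_value):
    """Map each code page to the Unicode name of the character at byte_value.

    Hypothetical helper for illustration only.
    """
    return {
        cp: unicodedata.name(bytes([byte_value]).decode(cp))
        for cp in CODE_PAGES
    }


if __name__ == "__main__":
    # 0x85 is where both OEM code pages place "a with grave";
    # 0xE0 is decimal 224, the value from the ALT+0224 example.
    for value in (0x85, 0xE0):
        print(hex(value))
        for cp, name in describe(value).items():
            print("   ", cp, "->", name)
```

Byte 0x85 comes out as LATIN SMALL LETTER A WITH GRAVE under both 437 and 850, which fits the observation that both machines show accented characters, while byte 224 is that letter only under Windows-1252.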


--
MichKa [MS]
NLS Collation/Locale/Keyboard Technical Lead
Globalization Infrastructure, Fonts, and Tools
Microsoft Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.
 