Question about Unicode in Internet Explorer and Notepad

Guest

I am a bit intrigued as to why Unicode pages in Internet Explorer
appear as non-Unicode in the HTML source.

I need to explain what I mean:
I browsed in IE to a Unicode page (non-English text).
Then I selected some text, copied it, and pasted it into
Notepad.
When I save in Notepad, I receive a message
that some of the text is in Unicode, and that unless I select
Unicode, I will lose the text.
OK, this means the text that I copied from IE was Unicode.
Now for the strange bit: from IE I choose View Source.
A Notepad window with the HTML source of the previous text
appears. I locate the same text in the source window,
copy it to a new instance of Notepad, and save. This time I
receive no Unicode warning, i.e. the text is non-Unicode; it
is simply foreign 8-bit text in the range 224-255, as was
the case before the advent of Unicode.

Can anyone explain why the source is non-Unicode but
the displayed text is Unicode?

I don't understand this....

Tia.
 
George Hester

Tia, in HTML there is a meta tag that looks like this:

<meta http-equiv="Content-Type" content="text/html; charset=windows-1252"/>

This means that the text in the HTML is encoded in the windows-1252 code page, a single-byte Western European character set that extends ASCII.

In a Japanese page this might be in the HTML code:

<meta http-equiv="Content-Type" content="text/html;charset=x-euc-jp"/>

Here the HTML is encoded in the x-euc-jp code page.

Because the browser understands this meta tag, it decodes the bytes of the source using the declared code page and converts them to Unicode for display, so the text you copy from the rendered page is Unicode. If the browser did not support that code page, you would get box-like characters instead.
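As a rough illustration of how a declared code page turns source bytes into Unicode characters, here is a minimal Python sketch. It assumes, purely for the example, that the page was Hebrew text in windows-1255; the thread does not say which language or code page was actually involved, and the byte values and variable names are made up.

```python
# Four bytes in the 224-255 range, as they might sit in the HTML source.
# Assumption: the page declares charset=windows-1255 (Hebrew).
raw = bytes([0xF9, 0xEC, 0xE5, 0xED])

# Decoding with the declared code page yields Unicode characters,
# which is roughly what the browser does before displaying the page.
displayed = raw.decode("cp1255")
code_points = [f"U+{ord(ch):04X}" for ch in displayed]

# The same bytes read with a different code page give different characters.
misread = raw.decode("cp1252")

print(code_points)
print(ascii(misread))
```

The bytes themselves never change; only the code page used to interpret them decides which Unicode characters come out.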

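Notepad started out as a plain single-byte text editor, and Unicode support in it is relatively new. It still defaults to saving in the system's ANSI code page, at one byte per character. Text pasted from the browser arrives as Unicode, so if it contains characters the ANSI code page cannot represent, Notepad warns you and offers to save as Unicode, which takes 2 bytes per character. The View Source window shows the original bytes of the file, which are already in a single-byte code page, so saving them needs no conversion and you get no warning. If you save in Unicode in Notepad and open the file in an application that handles 2-byte characters, you will see the text just as you saw it on the Web.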
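The byte sizes mentioned above can be checked with a short Python sketch. It assumes that the "Unicode" save option corresponds to UTF-16 little-endian and uses a hypothetical four-letter Hebrew sample; neither detail comes from the original question.

```python
# Four Hebrew letters, written as escapes (hypothetical sample text).
text = "\u05e9\u05dc\u05d5\u05dd"

# Assumed "Unicode" save format: UTF-16 little-endian, two bytes per character here.
as_utf16 = text.encode("utf-16-le")

# A matching single-byte code page stores one byte per character.
as_cp1255 = text.encode("cp1255")

# A code page that lacks these characters cannot store them at all;
# this is the kind of situation in which a "text will be lost" warning appears.
try:
    text.encode("cp1252")
    fits_cp1252 = True
except UnicodeEncodeError:
    fits_cp1252 = False

print(len(as_utf16), len(as_cp1255), fits_cp1252)
```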

HTH
 
