Character encoding - 1252 vs. ISO-8859-1

JS · Mar 17, 2006

I was wondering why one would specify a character encoding of 1252 vs.
ISO-8859-1 when retrieving data via HTTP. I am retrieving XML via HTTP
that contains French characters, and I have specified the encoding as
follows:

Dim str As New StreamReader([data source], _
    System.Text.Encoding.GetEncoding("ISO-8859-1"))

Doing this works fine and I retrieve the data without the special
French characters being dropped. When I change the above line of code
to the following:

Dim str As New StreamReader([data source], _
    System.Text.Encoding.GetEncoding(1252))

The end result is the same.

Is there any advantage to one encoding over another?

Joerg Jooss · Mar 17, 2006

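Well, both are dated. Windows-1252 is effectively a superset of ISO-8859-1:
the two agree on every byte except the range 0x80-0x9F, where ISO-8859-1 has
no printable characters and Windows-1252 adds some.
See http://www.microsoft.com/globaldev/reference/sbcs/1252.mspx and http://www.microsoft.com/globaldev/reference/iso/28591.mspx.
ISO-8859-1 does not contain €, nor the uppercase and lowercase "oe" ligature
(Unicode \u0152 and \u0153). Windows-1252 contains both. The accented letters
common in French (é, è, ç, à and so on) have the same byte values in both
encodings, which is why you see the same result; the two only diverge if the
data contains characters such as € or œ.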
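
To make the overlap concrete, here is a small sketch in Python rather than VB.NET, purely for brevity. The sample bytes are made up for illustration and are not taken from the original feed; the byte-to-character mappings shown are those of the two encodings themselves, so the same observation should carry over to the StreamReader case.

```python
# Illustrative sample bytes (an assumption, not data from the thread).
common = bytes([0xE9, 0xE8, 0xE7, 0xE0])  # accented letters: é è ç à
differs = bytes([0x80, 0x8C, 0x9C])       # €, Œ, œ in Windows-1252

# Bytes in the shared range decode identically under both encodings.
latin1_common = common.decode("iso-8859-1")
cp1252_common = common.decode("cp1252")
same_for_accents = latin1_common == cp1252_common

# Bytes in 0x80-0x9F are where the two encodings part ways.
cp1252_special = differs.decode("cp1252")      # printable characters
latin1_special = differs.decode("iso-8859-1")  # C1 control characters

print(same_for_accents)
print(repr(cp1252_special), repr(latin1_special))
```

So a feed that only contains letters like é or ç looks the same either way, while a feed produced as Windows-1252 and containing € or œ would come out as invisible control characters if read as ISO-8859-1.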

Modern applications should instead use one of the Unicode Transformation
Formats, such as UTF-8.
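
As a rough illustration of that recommendation (again in Python, with a made-up sample string), UTF-8 can carry characters that ISO-8859-1 cannot encode at all, at the cost of using more than one byte for them:

```python
# Hypothetical sample text: "cœur à 5 €", written with escapes for clarity.
text = "c\u0153ur \u00e0 5 \u20ac"

# UTF-8 encodes every Unicode character and round-trips losslessly.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# ISO-8859-1 has no code point for œ or €, so encoding fails.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(len(text), len(utf8_bytes), round_trip == text, latin1_ok)
```

The non-ASCII characters take two or three bytes each in UTF-8, which is why the byte count exceeds the character count.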

Cheers,

JS · Mar 17, 2006

Okay, that matches what I found when researching the difference between
the two, but I figured there must be something else I was missing.
Unfortunately I cannot get our remote partners to switch to UTF-8 (or
anything else more current), so I am stuck with it, but at least I feel
comfortable with what I am doing.

Thank you Joerg; great information and assistance, as always.

J.
