requestEncoding = "ISO-8859-1"

Mark

In our web.config, we have changed the first line below to look like the
second:

OLD: <globalization requestEncoding="utf-8" responseEncoding="utf-8" />
NEW: <globalization requestEncoding="ISO-8859-1"
responseEncoding="ISO-8859-1" />

What characters are we excluding from working properly and/or what problems
might we encounter by making this change to our site? Practical examples
would be appreciated.

Thanks in advance.

Mark
 
Joerg Jooss

Thus wrote Mark,
> In our web.config, we have changed the first line below to look like
> the second:
>
> OLD: <globalization requestEncoding="utf-8" responseEncoding="utf-8" />
> NEW: <globalization requestEncoding="ISO-8859-1"
> responseEncoding="ISO-8859-1" />
>
> What characters are we excluding from working properly and/or what
> problems might we encounter by making this change to our site?

ISO-8859-1 is dated. It doesn't even include the Euro sign. Alas, the French
even forgot to include one or two of their special characters (the œ
ligature, for one) ;-)

I wonder why you would ever want to switch from UTF-8 to ISO-8859-x unless
you have to support some ancient web client that doesn't know Unicode...
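
To make this concrete, here is a small sketch that checks which of a few characters each encoding can represent. It is written in Python only because its codecs are easy to poke at; it is not ASP.NET code, and the helper name `can_encode` is made up for this illustration.

```python
def can_encode(ch: str, encoding: str) -> bool:
    """Return True if `ch` has a representation in `encoding`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Euro sign, French ligature, and two accented letters (written as escapes
# so the script does not depend on how this file itself is saved).
for ch in ("\u20ac", "\u0153", "\u00e9", "\u00f1"):
    print(repr(ch),
          "ISO-8859-1:", can_encode(ch, "iso-8859-1"),
          "ISO-8859-15:", can_encode(ch, "iso-8859-15"),
          "UTF-8:", can_encode(ch, "utf-8"))
```

ISO-8859-15 is the later revision that added the Euro sign and the œ ligature; UTF-8 can represent all four.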

Cheers,
 
Marc Scheuner

> In our web.config, we have changed the first line below to look like the
> second:
> OLD: <globalization requestEncoding="utf-8" responseEncoding="utf-8" />
> NEW: <globalization requestEncoding="ISO-8859-1"
> responseEncoding="ISO-8859-1" />
>
> What characters are we excluding from working properly and/or what problems
> might we encounter by making this change to our site? Practical examples
> would be appreciated.

ISO-8859-1 is the "Western European" character set, so for all western
European languages such as English, Spanish, German, French, Italian
and so forth, you should be fine. This corresponds to what is often
loosely referred to as "extended ASCII".

You might run into trouble with Eastern European languages (Czech,
Slovak, Polish, Hungarian, Russian, and so forth); I'm not sure about
Greek. And of course, any language using a non-Latin alphabet
(such as Arabic, Hebrew, and the many Asian languages) will be
entirely excluded.
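
As a practical check, the sketch below lists which characters in a few sample words fall outside ISO-8859-1. It is Python rather than ASP.NET code, the sample words are arbitrary, and `outside_latin1` is a hypothetical helper name. It relies on the fact that ISO-8859-1 covers exactly the code points U+0000 to U+00FF.

```python
def outside_latin1(text: str) -> list:
    """List the distinct characters in `text` that ISO-8859-1 cannot encode."""
    missing = []
    for ch in text:
        # ISO-8859-1 maps exactly the code points U+0000..U+00FF.
        if ord(ch) > 0xFF and ch not in missing:
            missing.append(ch)
    return missing

# Arbitrary sample words, written with escapes so the script's own
# file encoding does not matter.
samples = {
    "Spanish": "Se\u00f1or Mu\u00f1oz",
    "German": "J\u00f6rg M\u00fcller",
    "Polish": "\u0141\u00f3d\u017a",
    "Czech": "Dvo\u0159\u00e1k",
    "Greek": "\u0391\u03b8\u03ae\u03bd\u03b1",
    "Russian": "\u041c\u043e\u0441\u043a\u0432\u0430",
}
for language, text in samples.items():
    print(language, text, outside_latin1(text))
```

The Spanish and German samples come back with an empty list, while the Polish, Czech, Greek and Russian samples all contain characters that ISO-8859-1 cannot carry.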

HTH
Marc
 
Mark

I would love to remain with UTF-8 as well, but the scenario below appears to
require ISO-8859-x.

* Create a new web project with an .HTM page and a .ASPX page using all the
defaults of VS.NET 2003.
* Put an HTML form on the HTM page. Send the results of the form submission
to the .ASPX page.
* If the form includes any fun accents (Spanish names for example), the
Request object on the .aspx page nukes them. The code below illustrates
this.
* The client I'm using to test this is the latest patched version of IE
(6.0.29.. SP2)

string Test1 = Request["FirstName"];

It was my understanding that UTF-8 should be all encompassing, allowing for
all of these other characters??? I'm very confused ...

Thanks in advance, and thanks for your initial reply.

Mark
 
Joerg Jooss

Thus wrote Mark,
> I would love to remain with UTF-8 as well, but the scenario below
> appears to require ISO-8859-x.
>
> * Create a new web project with an .HTM page and a .ASPX page using
>   all the defaults of VS.NET 2003.
> * Put an HTML form on the HTM page. Send the results of the form
>   submission to the .ASPX page.
> * If the form includes any fun accents (Spanish names for example),
>   the Request object on the .aspx page nukes them. The code below
>   illustrates this.
> * The client I'm using to test this is the latest patched version of
>   IE (6.0.29.. SP2)
>
> string Test1 = Request["FirstName"];
>
> It was my understanding that UTF-8 should be all encompassing,
> allowing for all of these other characters??? I'm very confused ...

Don't be. All that happens is that your HTML form likely doesn't specify
any encoding, thus your browser assumes ISO-8859-1 by default (check your
browser's encoding option after loading the form!).

HTML 4.01 even includes an attribute to specify the character encoding for
a submitted web form ("accept-charset"), but last time I checked it was practically
unsupported. What happens instead is that the browser reuses the encoding of
the response that delivered the form (ISO-8859-1 assumed if not specified)
as the encoding of the subsequent request.

Therefore, simply mark your static HTML as UTF-8 encoded as well (and of
course encode them physically using UTF-8 as well!):

<META http-equiv="Content-Type" content="text/html; charset=UTF-8">

That should do the trick.
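
The mismatch described above can be reproduced outside of ASP.NET. The following is a rough Python simulation, not what IE or ASP.NET literally execute: `quote` stands in for the browser encoding the form field, `unquote` for the server decoding it with `requestEncoding="utf-8"`, and `errors="replace"` is an assumption about how undecodable bytes get handled.

```python
from urllib.parse import quote, unquote

name = "Jos\u00e9"  # José, with a precomposed e-acute

# What a browser would send if it treated the form page as ISO-8859-1:
# the single byte 0xE9 is percent-encoded.
sent_latin1 = quote(name, encoding="iso-8859-1")
# What it would send if the form page was declared (and saved) as UTF-8.
sent_utf8 = quote(name, encoding="utf-8")

# A server expecting UTF-8 decodes both submissions as UTF-8.
# errors="replace" is a stand-in for whatever the server does with bad bytes.
received_from_latin1 = unquote(sent_latin1, encoding="utf-8", errors="replace")
received_from_utf8 = unquote(sent_utf8, encoding="utf-8", errors="replace")

print(sent_latin1, "->", repr(received_from_latin1))
print(sent_utf8, "->", repr(received_from_utf8))
```

In the first case the lone byte 0xE9 is not valid UTF-8, so the accented letter is lost; in the second case the name survives intact. That is why declaring the static form page as UTF-8 fixes the submission without touching web.config.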
 
Juan T. Llibre

re:
> It was my understanding that UTF-8 should be all encompassing, allowing for all of these other
> characters?

I've *never* been able to get utf-8 to display
all the characters necessary to write in Spanish.

ISO-8859-1 displays them without breaking a sweat.



Juan T. Llibre
ASP.NET MVP
ASPNETFAQ.COM : http://www.aspnetfaq.com
 
Juan T. Llibre

Joerg,

We've had this conversation before, but never resolved it.

What I've found, in my experience, is that ASP.NET settings take
precedence over HTML settings and, thus, utf-8 doesn't display
characters 128-255 as you say it should.

If I include
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
in an aspx page source, the setting which prevails is the one specified
in web.config.

Can you post a complete page example which proves what you're saying?



 
Mark

You wrote:
*** Therefore, simply mark your static HTML as UTF-8 encoded as well (and of
course encode them physically using UTF-8 as well!): ***

I placed the META tag in the HTML form file. Your sentence implies two
steps? Sorry - how does one encode them physically using UTF-8 as well?

Thanks again.

Mark


 
Juan T. Llibre

I think you're fooling yourself and using standard HTML, instead of ASPX.

If you remove the action attribute ( action="FormReceive.aspx" )
and set the encoding to utf-8 in web.config, you'll see that
the accented characters are *not* displayed.

ASPX forms don't need to have the action attribute set.

The code-behind you included isn't being processed at all, given that
you don't have a Page directive to indicate that it be processed.

So, while the META tag *does* work for the HTML which is processed
in that page, ASP.NET has nothing to do with that processing.

re:
> I'm still baffled on what is the RIGHT solution.

For HTML what you did works, but not for ASPX.

For ASPX, the only way I've found which works is
setting the encoding to iso-8859-1 in web.config.

Setting it to utf-8 has never displayed those characters for me.



 
Mark

Thanks Juan. I apologize for not providing more detail. The HTML I
attached in the file was a .htm file. The code behind I sent was for
FormReceive.aspx.cs. I copied the code from the Page_Load method. I did
not send the HTML for FormReceive.aspx because it was a blank web page, and
only displays the Response.Write() output.

So, if I removed the action, the .HTM page does nothing at all. With the
action attribute, it redirects to my .aspx page, the code behind executes,
and displays the Spanish text.

Am I misunderstanding your comments below? If not, my question remains: "I'm
still baffled on what is the RIGHT solution." OPTIONS: 1) setting it in
the web.config permanently to an old standard, or 2) hard-coding UTF-8
into all our HTML forms?

Thanks again.

Mark
 
Joerg Jooss

Thus wrote Juan,
> Joerg,
>
> We've had this conversation before, but never resolved it.

You never came back to me ;-)

> What I've found, in my experience, is that ASP.NET settings take
> precedence over HTML settings and, thus, utf-8 doesn't display
> characters 128-255 as you say it should.

That happens whenever the actual request encoding and the expected request
encoding don't match, and it breaks with practically any combination of
encodings other than those that happen to be compatible, such as US-ASCII
(actual) and UTF-8 (expected).

Whenever you find that for an expected encoding <e1> a web form's form data
is corrupted, but switching to encoding <e2> on the server-side solves the
problem, all that you do is adjust the server side to the client side. The
point is that the client should have used <e1>.
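
A short sketch of what "compatible" means here, again in Python rather than ASP.NET, with `same_bytes` as a made-up helper name. It compares the bytes a client would produce under one encoding with the bytes the server expects under another.

```python
def same_bytes(text: str, actual: str, expected: str) -> bool:
    """True if `text` encodes to identical bytes under both encodings."""
    return text.encode(actual) == text.encode(expected)

# Pure ASCII: US-ASCII and UTF-8 agree byte for byte, so a mismatch is harmless.
print(same_bytes("Mark", "us-ascii", "utf-8"))
# Beyond ASCII the encodings diverge ("J\u00f6rg" is Jörg).
print(same_bytes("J\u00f6rg", "iso-8859-1", "utf-8"))

# Client sends UTF-8, server expects ISO-8859-1: no error is raised,
# but the two UTF-8 bytes of the umlaut show up as two wrong characters.
garbled = "J\u00f6rg".encode("utf-8").decode("iso-8859-1")
print(repr(garbled))
```

Switching the server to whatever the client happened to send makes the symptom go away, but as noted above the cleaner fix is to get the client to send the encoding the server expects.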

> If I include
> <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
> in an aspx page source, the setting which prevails is the one
> specified in web.config.

Yes, but this is completely irrelevant in the case of Mark's problem. Maybe I
wasn't clear enough: by "HTML form" I meant the plain HTML file. That's where
you need a META tag, because that's the form that uses the wrong encoding.

Adding a META tag to an ASP.NET web form isn't useful, as ASP.NET generates
an appropriate HTTP Content-Type header like "Content-Type: text/html;
charset=utf-8", which thankfully overrides any META tag.
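
The precedence rule can be sketched as follows. This is an illustrative Python model, not browser or ASP.NET code: both function names are hypothetical, and it assumes the META `content` value has the same syntax as the HTTP header value.

```python
from email.message import Message
from typing import Optional

def charset_from_content_type(value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if any."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()

def effective_charset(http_header: Optional[str],
                      meta_content: Optional[str]) -> Optional[str]:
    """The HTTP header's charset wins; the META declaration is only a fallback."""
    for source in (http_header, meta_content):
        if source:
            charset = charset_from_content_type(source)
            if charset:
                return charset
    return None

# An .aspx response: the header carries a charset, so the META tag is ignored.
print(effective_charset("text/html; charset=utf-8",
                        "text/html; charset=ISO-8859-1"))
# A response whose header has no charset: the META tag decides.
print(effective_charset("text/html", "text/html; charset=UTF-8"))
```

The second call models a static .htm file served without a charset parameter in the header, which is the situation where the META tag actually matters.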

> Can you post a complete page example which proves what you're saying?

Well, there's http://www.microsoft.com/spain ;-)

But I'll be happy to use that topic for my starving blog. Give me a week
or so, OK? I'll also review that stuff we looked into last year.
 
Joerg Jooss

Thus wrote Mark,
> You wrote:
> *** Therefore, simply mark your static HTML as UTF-8 encoded as well
> (and of course encode them physically using UTF-8 as well!): ***
>
> I placed the META tag in the HTML form file. Your sentence implies
> two steps? Sorry - how does one encode them physically using UTF-8
> as well?

Mark, the second step is only required if your HTML file contains non-ASCII
characters that are not written as HTML character references.

Example: I can either use "J&ouml;rg" or "Jörg" to write my name in some
static HTML. But the latter string requires an appropriate encoding for the
HTML source file.

Thus, I need to pick an encoding that is capable of representing the character
'ö' like UTF-8 or ISO-8859-1. When I save this file, I need to tell my HTML
editor this desired encoding. How this is done depends on the editor. Notepad
for example has an encoding drop-down in its "Save As" file dialog. Visual
Studio has an option for selecting an encoding in its File menu, as do SciTE
or UltraEdit as far as I remember.

But of course one thing always remains true: You must not declare in your
HTML page's META tag an encoding that's different from the one you used to
save the file. Of course the same compatibility rules as in the request/response
scenario apply, but it's best just to keep META declaration and actual file
encoding the same.
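
To illustrate the two steps, here is a small Python sketch. It is not tied to any particular editor: `matches_declaration` is a hypothetical helper, and the byte strings stand in for what an editor would write to disk under each "Save As" encoding.

```python
import html

entity_form = "J&ouml;rg"
raw_form = "J\u00f6rg"  # Jörg, typed as a literal character

# The character reference is pure ASCII, so it is stored identically
# whichever of the two encodings the file is saved in.
print(entity_form.encode("utf-8") == entity_form.encode("iso-8859-1"))
print(html.unescape(entity_form) == raw_form)

# The literal character is stored differently depending on the save encoding.
saved_as_utf8 = raw_form.encode("utf-8")         # b'J\xc3\xb6rg'
saved_as_latin1 = raw_form.encode("iso-8859-1")  # b'J\xf6rg'

def matches_declaration(file_bytes: bytes, declared: str) -> bool:
    """True if the bytes, read with the declared charset, give the intended text."""
    try:
        return file_bytes.decode(declared) == raw_form
    except UnicodeDecodeError:
        return False

for saved, label in ((saved_as_utf8, "saved as UTF-8"),
                     (saved_as_latin1, "saved as ISO-8859-1")):
    for declared in ("utf-8", "iso-8859-1"):
        print(label, "/ META says", declared, "->",
              matches_declaration(saved, declared))
```

Only the combinations where the declared charset equals the save encoding reproduce the name correctly.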

Cheers,
 
Mark

Thank you! I should mention that this whole discussion certainly makes me
wish I had an umlaut in my name. "Mark" is far too bland.

mark
 
