French characters in query string

darin dimitrov

Hello,

How can I convert a URL-encoded string containing some French
characters back to the original string?

I have the following HTML form:
<form>
name = <input type="text" name="name" value="Noël Pirès"/>
<input type="submit" value="OK"/>
</form>

When I submit this form, the following data is sent at the protocol level
using HTTP GET: name=No%EBl+Pir%E8s

I tried converting it with the HttpUtility.UrlDecode method to obtain
the original string, but without success. The method returned
"name=Nol Pirs", with the accented French characters omitted. What
function should I use to decode such a string?

In fact, I would like to implement the following: I have a text file
containing the URL-encoded string, and I want to do the conversion in
a console application, something like:

------------------------
// Read the whole file into a byte buffer.
FileStream fin = new FileStream(@"c:\test.txt", FileMode.Open, FileAccess.Read);
byte[] buf = new byte[fin.Length];
fin.Read(buf, 0, buf.Length);
fin.Close();
string s = Encoding.Default.GetString(buf);

// Decode the percent-escapes and print the result.
s = HttpUtility.UrlDecode(s);
System.Console.Write(s);
------------------------
 
Jon Skeet [C# MVP]

darin dimitrov said:
How can I convert a URL-encoded string containing some French
characters back to the original string?

As far as I know, there's not much in the way of standardisation for
what URL encodings mean above the top of ASCII. The most robust way of
proceeding, IMO, is to use post data instead.
 
darin dimitrov

Jon Skeet said:
As far as I know, there's not much in the way of standardisation for
what URL encodings mean above the top of ASCII. The most robust way of
proceeding, IMO, is to use post data instead.


I will post the solution I found myself in case someone else is
interested:

The problem was that the HttpUtility.UrlDecode(string) overload I was
calling does not take an encoding, so instead I used the
HttpUtility.UrlDecodeToBytes method, like this:

<code>
string s = "name=No%EBl+Pir%E8s";
byte[] binaryData = HttpUtility.UrlDecodeToBytes(s, Encoding.Default);
s = Encoding.Default.GetString(binaryData);
</code>

Now s = "name=Noël Pirès"
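
For reference, HttpUtility also has a UrlDecode(string, Encoding) overload, so the same result can be reached in a single call. The following is only a minimal sketch: it assumes the client encoded the form data as ISO-8859-1, which is consistent with the %EB and %E8 escapes above but is not stated in the thread (Encoding.Default varies from machine to machine). The class and method names are made up for illustration.

```csharp
using System;
using System.Text;
using System.Web;

public static class ExplicitEncodingDecode
{
    // Illustrative helper: decodes percent-escapes as ISO-8859-1 bytes.
    public static string Decode(string raw)
    {
        // Assumption: the client used ISO-8859-1 (Latin-1), not UTF-8.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        return HttpUtility.UrlDecode(raw, latin1);
    }

    public static void Main()
    {
        // %EB and %E8 are the Latin-1 bytes for 'ë' and 'è'; '+' becomes a space.
        string decoded = Decode("name=No%EBl+Pir%E8s");
        Console.WriteLine(decoded == "name=No\u00EBl Pir\u00E8s"); // True
    }
}
```

Naming the encoding explicitly, rather than relying on Encoding.Default, keeps the result the same regardless of the regional settings of the machine running the code.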
 
Jon Skeet [C# MVP]

darin dimitrov said:
I will post the solution I found myself in case someone else is
interested:

The problem was that the HttpUtility.UrlDecode(string) overload I was
calling does not take an encoding, so instead I used the
HttpUtility.UrlDecodeToBytes method, like this:

<code>
string s = "name=No%EBl+Pir%E8s";
byte[] binaryData = HttpUtility.UrlDecodeToBytes(s, Encoding.Default);
s = Encoding.Default.GetString(binaryData);
</code>

Now s = "name=Noël Pirès"

That's fine so long as you know that the client was using the same
encoding - but do you?
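
To illustrate why the client's encoding matters, here is a small sketch showing that the same name produces different percent-escapes under ISO-8859-1 and UTF-8, and that decoding with the wrong one does not recover the text. The two encodings are chosen purely as examples, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;
using System.Web;

public static class EncodingAmbiguity
{
    // Illustrative helper: encodes with one encoding, decodes with another,
    // and reports whether the original text survives the round trip.
    public static bool RoundTrips(string text, Encoding clientEncoding, Encoding serverEncoding)
    {
        string escaped = HttpUtility.UrlEncode(text, clientEncoding);
        return HttpUtility.UrlDecode(escaped, serverEncoding) == text;
    }

    public static void Main()
    {
        string name = "No\u00EBl Pir\u00E8s"; // "Noël Pirès"
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // The escapes differ: one byte per accented letter in Latin-1, two in UTF-8.
        Console.WriteLine("Latin-1 client: " + HttpUtility.UrlEncode(name, latin1));
        Console.WriteLine("UTF-8 client:   " + HttpUtility.UrlEncode(name, Encoding.UTF8));

        // Matching encodings round-trip; mismatched ones silently produce other characters.
        Console.WriteLine(RoundTrips(name, Encoding.UTF8, Encoding.UTF8)); // True
        Console.WriteLine(RoundTrips(name, Encoding.UTF8, latin1));        // False
    }
}
```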
 
darin dimitrov

Jon Skeet said:
That's fine so long as you know that the client was using the same
encoding - but do you?


In my particular case I know the encoding that my clients are using,
but I agree that this is not a general solution. I would appreciate
any better suggestions.
 
Jon Skeet [C# MVP]

darin dimitrov said:
In my particular case I know the encoding that my clients are using,
but I agree that this is not a general solution. I would appreciate
any better suggestions.

As I said before, a much better way of passing data is in the body of
the request, as that has an encoding associated with it.
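
As a rough sketch of what this could look like on the receiving side: if the request carries a Content-Type header with a charset parameter, the body can be decoded with that encoding instead of a guess. This assumes the client actually sends the charset, which is not guaranteed; the DecodeBody helper, the UTF-8 fallback, and the sample header value are all assumptions made for illustration.

```csharp
using System;
using System.Net.Mime;
using System.Text;
using System.Web;

public static class BodyCharset
{
    // Hypothetical helper: picks the decoder from a Content-Type header value.
    public static string DecodeBody(string contentTypeHeader, string body)
    {
        ContentType contentType = new ContentType(contentTypeHeader);

        // Assumption: fall back to UTF-8 when no charset parameter was sent.
        Encoding encoding = string.IsNullOrEmpty(contentType.CharSet)
            ? Encoding.UTF8
            : Encoding.GetEncoding(contentType.CharSet);

        return HttpUtility.UrlDecode(body, encoding);
    }

    public static void Main()
    {
        string header = "application/x-www-form-urlencoded; charset=iso-8859-1";
        string decoded = DecodeBody(header, "name=No%EBl+Pir%E8s");
        Console.WriteLine(decoded == "name=No\u00EBl Pir\u00E8s"); // True
    }
}
```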
 
