encoding question

Jim Lawton

Hello,

tried this in framework.aspnet without any luck so far, maybe someone here has
a comment ...

TIA, Jim

.NET C# HttpHandler, straight HTML form at the browser.

GBP pound sign problem (I know I know - I *can* decode it, but I've got to
understand what and why I should be doing stuff)

I am uploading text data from a form. This data is either typed directly into a
textarea, or is a file stream originating from a .txt file (or some other basic
text file, e.g. off a Mac or Unix - of course I can't necessarily know at
present that it's only .txt).

The page encoding is :-
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

On arrival at the server the content encoding is, sure enough, UTF-8.

Data input via the textarea and input to a string is displayed in the debugger
as pounds (£)

Data input as a filestream has in the stream single bytes containing 0xA3 for
the GBP pound sign.

I process the input stream like this :-

public static string StreamToString(Stream aStream)
{
aStream.Position = 0;
long i = aStream.Length;
byte[] buffer = new byte[i];

aStream.Read(buffer,0,(int)aStream.Length);
return BytesToUTF8String(buffer);
}

public static string BytesToUTF8String(byte[] Array)
{
Encoding utf8 = Encoding.UTF8;
char[] utf8Chars = new char[utf8.GetCharCount(Array, 0,Array.Length)];
utf8.GetChars(Array, 0, Array.Length, utf8Chars, 0);

return new string(utf8Chars);
}

The resulting string contains nothing ...

If I use ASCII instead of UTF8, the text makes sense except that my GBP signs
come out as question marks (?).

If I use UTF7 I get an apparently OK decoding.

I am dubious about using UTF7 for no better reason than that it works. Is there
logic here? What should I be doing?

Thanks,
Jim
 
Jon Skeet [C# MVP]

Jim Lawton said:
I am dubious about using UTF7 for no better reason than that it works. Is there
logic here? What should I be doing?

You should probably be using Encoding.Default, or Encoding.GetEncoding
(28591) (i.e. ISO-8859-1).

You almost certainly *don't* want UTF-7 really.
 
Jon Skeet [C# MVP]

Jon Skeet said:
You should probably be using Encoding.Default, or Encoding.GetEncoding
(28591) (i.e. ISO-8859-1).

You almost certainly *don't* want UTF-7 really.

Thinking about it further, ISO-8859-1 won't work either - basically you
need to know the original encoding of the file. It may well be Windows
CP-1252, which will be what Encoding.Default will probably return if
you're in Western Europe or the US, unless you've changed the defaults,
but really you're still going to be at the whim of files which *aren't*
written with that encoding :(
 
Jon Skeet [C# MVP]

Jon Skeet said:
Thinking about it further, ISO-8859-1 won't work either

Apparently I didn't think about it enough. Ignore me - ISO-8859-1
*will* work for converting byte A3 into a pound sign. Whether it'll
work for the rest of the file depends on the contents of the file.
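
To illustrate the point about byte A3, here is a minimal sketch (not code from the thread; the class and method names are made up for illustration). It decodes the same bytes as ISO-8859-1 and as UTF-8. A lone 0xA3 is the pound sign in ISO-8859-1 and CP-1252, but it is not a valid UTF-8 sequence, where the pound sign is the two bytes 0xC2 0xA3.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class PoundSignDemo
{
    // Decodes the bytes as ISO-8859-1 (code page 28591),
    // where every byte maps directly to the code point of the same value.
    public static string DecodeLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding(28591).GetString(bytes);
    }

    // Decodes the bytes as UTF-8. What an invalid byte turns into
    // (dropped, or a replacement character) depends on the framework version.
    public static string DecodeUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

With this, `PoundSignDemo.DecodeLatin1(new byte[] { 0xA3 })` gives the pound sign, as does `PoundSignDemo.DecodeUtf8(new byte[] { 0xC2, 0xA3 })`, whereas `PoundSignDemo.DecodeUtf8(new byte[] { 0xA3 })` does not.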
 
Jim Lawton

Jon Skeet said:
Apparently I didn't think about it enough. Ignore me - ISO-8859-1
*will* work for converting byte A3 into a pound sign. Whether it'll
work for the rest of the file depends on the contents of the file.


:) ... thanks for all your thoughts Jon - if nothing else it gives me some
confidence that the whole encoding issue is a can of worms! It smells a bit of
the old "DLL Hell" to me - all we need is a few bytes on the front of any file
to say what encoding it is, but we'll never get it!

Jim
 
Jim Lawton

Jon Skeet said:
Apparently I didn't think about it enough. Ignore me - ISO-8859-1
*will* work for converting byte A3 into a pound sign. Whether it'll
work for the rest of the file depends on the contents of the file.


I think that's right - works well enough ... I've inspected ("watched") the
contents of the request, and I can't see anything which relates to the encoding
of the bytestream - just text/plain so I'm down to guessing. Input will always
be from the UK ...

Cheers Jim
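
Since the request carries nothing but text/plain, one common way to make the guess a little safer is to try strict UTF-8 first and fall back to a single-byte encoding only when the bytes are not valid UTF-8. The sketch below is one possible approach under that assumption, not something proposed in the thread; `UploadDecoder` and `DecodeWithFallback` are hypothetical names, and the fallback to ISO-8859-1 assumes Western European input. It does not strip a leading byte order mark.

```csharp
using System;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class UploadDecoder
{
    // Tries strict UTF-8 first; if the bytes are not valid UTF-8,
    // falls back to ISO-8859-1 (an assumption about the likely source).
    public static string DecodeWithFallback(byte[] bytes)
    {
        try
        {
            // Second argument (throwOnInvalidBytes) = true:
            // malformed UTF-8 raises an exception instead of being
            // silently dropped or replaced.
            UTF8Encoding strictUtf8 = new UTF8Encoding(false, true);
            return strictUtf8.GetString(bytes);
        }
        catch (ArgumentException)
        {
            // The exception thrown for invalid bytes derives from
            // ArgumentException, so this catches the decoding failure.
            return Encoding.GetEncoding(28591).GetString(bytes);
        }
    }
}
```

This is still a heuristic: a file in some other single-byte encoding will decode without error but may show the wrong characters, which is the "whim of files" problem mentioned earlier in the thread.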
 
