ASCII, StreamWriter and Swedish letters


MA

Hi all!

I have a major problem. I need to write a text file with 1 b per letter. But
it should be able to handle Swedish letters too (åäö).
Is it possible to use 8 b ASCII for this?
This file is used by an sms application and cannot be in another format.

/Marre
 
MA said:
I have a major problem. I need to write a text file with 1 b per letter. But
it should be able to handle Swedish letters too (åäö).
Is it possible to use 8 b ASCII for this?

There's no such thing as "8 bit ASCII" (assuming that's what you meant
by "b").
This file is used by an sms application and cannot be in another
format.

You need to find out *exactly* what encoding will be used. There are
various 8 bit character sets which are compatible with ASCII in the
range 0-127, but which are incompatible with each other above 127. If
you can find out which of those your app needs to output, it should be
easy to find the appropriate Encoding to give to your StreamWriter.
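
To make the point about bytes above 127 concrete, here is a minimal sketch that encodes "åäö" in three different 8-bit code pages. The class and method names are invented for illustration, and the RegisterProvider call assumes a modern .NET runtime (.NET Core 3.0 or later), where code pages such as 437 and 1252 are not available by default.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Hypothetical helper: encodes `text` with the named encoding and
    // returns the resulting bytes as a hex string such as "E5-E4-F6".
    public static string HexBytes(string text, string encodingName)
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered first. Calling this repeatedly is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        byte[] bytes = Encoding.GetEncoding(encodingName).GetBytes(text);
        return BitConverter.ToString(bytes);
    }

    public static void Main()
    {
        // "åäö" written with escapes so the source file's own encoding does not matter.
        string swedish = "\u00E5\u00E4\u00F6";
        foreach (string name in new[] { "ISO-8859-1", "windows-1252", "IBM437" })
        {
            Console.WriteLine(name + ": " + HexBytes(swedish, name));
        }
        // ISO-8859-1:   E5-E4-F6
        // windows-1252: E5-E4-F6
        // IBM437:       86-84-94
    }
}
```

ISO-8859-1 and Windows-1252 happen to agree on these three letters, while code page 437 stores them at entirely different byte values, so a file written in one and read in the other would show the wrong characters.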
 
You could also use cp1252, which supports Swedish well.

Or even better Unicode, which supports everything.


--
MichKa [MS]
NLS Collation/Locale/Keyboard Development
Globalization Infrastructure and Font Technologies
Windows International Division

This posting is provided "AS IS" with
no warranties, and confers no rights.


Morten Wennevik said:
Hi Marre,

I believe Swedish uses the standard European character set ISO-8859-1, or
you could use ISO-8859-15 which is the Nordic set (no å and æ in 8859-1, so
I would recommend the latter)
 
You could also use cp1252, which supports Swedish well.

Or even better Unicode, which supports everything.

Well, Unicode wouldn't be 8-bit, now would it :P
 
UTF-8 works with bytes....





 
Michael (michka) Kaplan said:
UTF-8 works with bytes....

Well, I wouldn't say it's an "8-bit encoding" in the normal sense of
the phrase...

Not every character in the set (in fact, very few!) can be represented
as a single byte.
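
A quick way to see this is to count the bytes UTF-8 produces for plain ASCII text versus the Swedish letters. This is only a small sketch; the class and method names are made up for the example.

```csharp
using System;
using System.Text;

public static class Utf8SizeDemo
{
    // Hypothetical helper: number of bytes `text` occupies when encoded as UTF-8.
    public static int Utf8Length(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(Utf8Length("abc"));                // 3: one byte per ASCII character
        Console.WriteLine(Utf8Length("\u00E5\u00E4\u00F6")); // 6: two bytes each for å, ä and ö
    }
}
```

So a consumer that expects exactly one byte per letter would not accept UTF-8 output for text containing åäö.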
 
Hi Marre,

I believe Swedish uses the standard European character set ISO-8859-1, or you could use ISO-8859-15 which is the Nordic set (no å and æ in 8859-1, so I would recommend the latter)

// StreamWriter.Encoding is read-only, so the encoding has to be passed to the constructor.
System.IO.StreamWriter sw = new System.IO.StreamWriter(path, false, System.Text.Encoding.GetEncoding("ISO-8859-15"));
 
Jon Skeet said:
Not every character in the set (in fact, very few!) can be represented
as a single byte.

Yes -- but Unicode covers a lot of ground. Attempts to do less lead to
corruption of text in the "lesser" code page, and I am reasonably certain
that such corruption is never a good thing....

:-)

 
Michael (michka) Kaplan said:
Yes -- but Unicode covers a lot of ground. Attempts to do less lead to
corruption of text in the "lesser" code page, and I am reasonably certain
that such corruption is never a good thing....

That's pretty much irrelevant when the encoding is fixed to start with
though, as the OP says it is. The best you can do is detect that you're
trying to write a character which isn't in the target character set,
and either throw an exception or write something else (eg '?').
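
Both options can be sketched with the encoder fallback mechanism in System.Text. The class and method names below are hypothetical, and ISO-8859-1 is used only because it needs no extra registration; the same pattern should apply to whichever code page the target application really requires.

```csharp
using System;
using System.Text;

public static class FallbackDemo
{
    // Strict variant: throws EncoderFallbackException when `text` contains
    // a character that the target character set cannot represent.
    public static byte[] EncodeStrict(string text, string encodingName)
    {
        Encoding enc = Encoding.GetEncoding(
            encodingName,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        return enc.GetBytes(text);
    }

    // Lenient variant: writes '?' in place of each unrepresentable character.
    public static byte[] EncodeLenient(string text, string encodingName)
    {
        Encoding enc = Encoding.GetEncoding(
            encodingName,
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));
        return enc.GetBytes(text);
    }

    public static void Main()
    {
        string text = "5 \u20AC"; // the euro sign is not part of ISO-8859-1

        Console.WriteLine(BitConverter.ToString(EncodeLenient(text, "ISO-8859-1"))); // 35-20-3F

        try
        {
            EncodeStrict(text, "ISO-8859-1");
        }
        catch (EncoderFallbackException)
        {
            Console.WriteLine("Text contains a character outside the target set.");
        }
    }
}
```

Which variant fits depends on whether a silently substituted '?' in an SMS is acceptable or whether the caller should be told the message cannot be sent as written.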
 
MA said:
I have a major problem. I need to write a text file with 1 b per letter. But
it should be able to handle Swedish letters too (åäö).
Is it possible to use 8 b ASCII for this?

There's no such thing as "8 bit ASCII" (assuming that's what you meant
by "b").
This file is used by an sms application and cannot be in another
format.

You need to find out *exactly* what encoding will be used. There are
various 8 bit character sets which are compatible with ASCII in the
range 0-127, but which are incompatible with each other above 127. If
you can find out which of those your app needs to output, it should be
easy to find the appropriate Encoding to give to your StreamWriter.

--
Jon Skeet - <[email protected]>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Ok. Is it ASCII-8 then?

Well, I solved it by using this code:

// Encode the text as code page 437 (one byte per character).
char[] test = mailContent.ToCharArray();
System.Text.Encoding enc = System.Text.Encoding.GetEncoding("437");
System.Text.Encoder ence = enc.GetEncoder();
byte[] bytes = new byte[ence.GetByteCount(test, 0, test.Length, true)];
ence.GetBytes(test, 0, test.Length, bytes, 0, true);

// Write the bytes, then close the stream so the file is flushed and released.
FileStream fsWriter = new FileStream(filePath + fileName, System.IO.FileMode.Create);
fsWriter.Write(bytes, 0, bytes.Length);
fsWriter.Close();



/Marre
 
[Please fix your newsreader to quote properly, btw - it's a pain to
reply when you follow-up in the way you have, especially including the
sig separator from my post]

MA said:
Ok. Is it ASCII-8 then?

Is what "ASCII-8", exactly? Can't say I've heard of that before.
Well, I solved it by using this code:

char[] test = mailContent.ToCharArray();

System.Text.Encoding enc = System.Text.Encoding.GetEncoding("437");

System.Text.Encoder ence = enc.GetEncoder();

FileStream fsWriter = new FileStream(filePath + fileName,
System.IO.FileMode.Create);

byte[] bytes = new Byte[ence.GetByteCount(test,0, test.Length, true)];

ence.GetBytes(test, 0, test.Length, bytes, 0, true);

fsWriter.Write(bytes, 0, bytes.Length);

Just using StreamWriter with an encoding of Encoding.GetEncoding("437")
would be rather simpler. However, are you *really* sure it's code page
437? If you've got the wrong code page, you'll write out the wrong
characters sooner or later.
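
As a rough sketch of that simpler approach, the encoding can be handed straight to a StreamWriter. The class and method names and the temporary file are invented for the example, and the RegisterProvider call assumes a modern .NET runtime where code page 437 is not available by default.

```csharp
using System;
using System.IO;
using System.Text;

public static class Cp437Writer
{
    // Hypothetical helper: writes `text` to `path` as code page 437,
    // producing one byte per character.
    public static void WriteCp437(string path, string text)
    {
        // Assumption: .NET Core / .NET 5+, where legacy code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp437 = Encoding.GetEncoding(437);

        // The using block flushes and closes the file even if Write throws.
        using (StreamWriter writer = new StreamWriter(path, false, cp437))
        {
            writer.Write(text);
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        WriteCp437(path, "\u00E5\u00E4\u00F6"); // "åäö"
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path))); // 86-84-94
        File.Delete(path);
    }
}
```

The writer takes care of the char-to-byte conversion, so the manual Encoder and byte-array handling is no longer needed.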
 
Is what "ASCII-8", exactly? Can't say I've heard of that before.
Well, as you might have noticed, I'm really a newbie here :) I looked at
this link:
http://homepage.cs.uri.edu/faculty/wolfe/book/Readings/R02 Ascii/completeASCII.htm
Just using StreamWriter with an encoding of Encoding.GetEncoding("437")
would be rather simpler. However, are you *really* sure it's code page
437? If you've got the wrong code page, you'll write out the wrong
characters sooner or later.
Yes. According to those who have developed the app I'm 'talking' with, it's
437.
It's not that important if some characters come out wrong, but if it happens, I
will probably return to this newsgroup :)

Sorry for my quoting problem. Never thought of that before.

/Marre
 
MA said:
Well, as you might have noticed, I'm really a newbie here :) I looked at
this link:
http://homepage.cs.uri.edu/faculty/wolfe/book/Readings/R02 Ascii/completeASCII.htm

Any page which claims ASCII has characters over 127 shouldn't be
trusted, I'm afraid...
Yes. According to those who have developed the app I'm 'talking' with, it's
437.

Good. It's always nice when you don't have to guess :)
It's not that important if some characters come out wrong, but if it happens, I
will probably return to this newsgroup :)
Righto.

Sorry for my quoting problem. Never thought of that before.

No problem - thanks for fixing it.
 