PC Review



BUG in StreamWriter

 
 
Guest
23rd Dec 2003
Hi,

When constructing a StreamWriter as follows:
FileStream f = new FileStream(..);
StreamWriter s = new StreamWriter(f);

and then attempting to write out the letters åäö, they become garbage.

BUT

if we construct the StreamWriter as follows:
FileStream f = new FileStream(..);
StreamWriter s = new StreamWriter(f, System.Text.Encoding.Default);

it's OK. So why is the default not the actual DEFAULT, as it says on the ctor?

It seems to me that either the ctor is wrong or the name .Default is misleading.

Thanks.


 
Jon Skeet [C# MVP]
23rd Dec 2003
<(E-Mail Removed)> wrote:
>
> When constructing StreamWriter with the following..
> FileStream f = new FileStream(..);
> StreamWriter s = new StreamWriter(f);
>
> Then attempt to write out åäö letters they become garbage.
>
> BUT
>
> If we call StreamWriter as follows...
> FileStream f = new FileStream(..);
> StreamWriter s = new StreamWriter(f, System.Text.Encoding.Default);
>
> Its ok. So why is default not the actual DEFAULT as it says on the ctor?
>
> It seems to me either the ctor is wrong or the name .Default is misleading.


.Default is *slightly* misleading, although all the information is in
the documentation. The docs for new StreamWriter(Stream) say:

<quote>
This constructor creates a StreamWriter with UTF-8 encoding whose
GetPreamble method returns an empty byte array. The BaseStream property
is initialized using the stream parameter.
</quote>

However, the brief summary saying that it uses "the default" encoding
is misleading (I'll mail MS about it).

.Default means the default *platform* encoding - but pretty much
everything in .NET itself uses UTF-8 by default.
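
A minimal sketch of the difference described above, not taken from the thread: it writes the same three letters through both constructors and dumps the raw bytes. The names WriterEncodingDemo and WriteAndDump are hypothetical, and ISO-8859-1 is only an assumed stand-in for what Encoding.Default resolves to on a Western European Windows machine (the real value depends on the system code page).

```csharp
using System;
using System.IO;
using System.Text;

public static class WriterEncodingDemo
{
    // Writes text through a StreamWriter and returns the raw bytes as hex.
    // A null encoding selects the StreamWriter(Stream) constructor, which
    // the documentation quoted above says uses UTF-8 with an empty preamble.
    public static string WriteAndDump(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = encoding == null
                ? new StreamWriter(stream)
                : new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            // MemoryStream.ToArray still works after the writer closed the stream.
            return BitConverter.ToString(stream.ToArray());
        }
    }
}

// Usage, with the letters written as escapes so the source file's own
// encoding does not matter:
//   WriterEncodingDemo.WriteAndDump("\u00E5\u00E4\u00F6", null)
//       returns "C3-A5-C3-A4-C3-B6" (two bytes per letter, UTF-8)
//   WriterEncodingDemo.WriteAndDump("\u00E5\u00E4\u00F6",
//       Encoding.GetEncoding("iso-8859-1"))
//       returns "E5-E4-F6" (one byte per letter)
```

Both outputs are valid encodings of the same text; they only look like garbage when read back with the other encoding.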

--
Jon Skeet - <(E-Mail Removed)>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
 
Jon Skeet [C# MVP]
23rd Dec 2003
<(E-Mail Removed)> wrote:
> So UTF8 cant handle umlaut characters it seems then


Yes it can. It's just that whatever you were using to read the file
presumably wasn't aware that it was encoded in UTF-8.
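
To illustrate the point, here is a small sketch (hypothetical names, not code from the thread) that encodes a string one way and decodes it another. ISO-8859-1 is again an assumed stand-in for the reader's ANSI code page.

```csharp
using System;
using System.Text;

public static class MisreadDemo
{
    // Encodes text with one encoding and decodes the bytes with another,
    // which is what happens when a viewer guesses the wrong encoding.
    public static string Reinterpret(string text, Encoding writtenAs, Encoding readAs)
    {
        byte[] bytes = writtenAs.GetBytes(text);
        return readAs.GetString(bytes);
    }
}

// Usage:
//   Encoding utf8 = new UTF8Encoding(false);
//   Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
//   MisreadDemo.Reinterpret("\u00E5\u00E4\u00F6", utf8, utf8)
//       returns the original three letters
//   MisreadDemo.Reinterpret("\u00E5\u00E4\u00F6", utf8, latin1)
//       returns six characters, "\u00C3\u00A5\u00C3\u00A4\u00C3\u00B6"
```

The data in the file is intact in both cases; only the second reader's interpretation is wrong.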

 
Guest
23rd Dec 2003
So UTF-8 can't handle umlaut characters, it seems, then.


 
Guest
23rd Dec 2003
According to the Windows file system, it says ASCII.

I thought that was standard enough, because I used the same format all the
way through the code and the umlauts are fine, but when it's writing (using
the default ctors) the output is garbled. I wiped the file and changed the
code to construct the StreamWriter with Encoding.Default, and it's saving the
umlaut characters now. How come the usual ctor with FileStream doesn't save
umlaut chars, then? Nowhere else did I specify any form of encoding until
this change to fix it.



 
Guest
23rd Dec 2003
<?xml version="1.0" encoding="utf-8"?>

was even defined in the XML file that I got the string from, and it's even
stored in the String type correctly. It only goes wrong when writing to the
file.

Normal calls specified WITHOUT encoding parameters did NOT save the umlaut
chars.



 
Guest
23rd Dec 2003
Opening the text file in Notepad and selecting Save As shows it's ANSI, not
UTF-8. How come the file created when appending is not stored as UTF-8,
then, as that's supposed to be the default that you state?

That would cause the mismatch, if the file create is creating as ANSI and
all methods default to UTF-8.



 
Jon Skeet [C# MVP]
23rd Dec 2003
<(E-Mail Removed)> wrote:
> According to windows file system it says ASCII


What do you mean by "according to the Windows file system"?

> I thought that was standard enough


ASCII doesn't have any characters with accents.

> Because I used the same format all the
> way thru the code and its umlauted ok but when its writing (using the
> default ctors) its garbled. I wiped the file, changed it to construct the
> SR with Encoding.Default and its saving the umlat charset now, howcome the
> usual ctor with FileStream doesnt save umlaut chars then as nowwhere else
> did I specify any form of encoding until this change to fix it.


It *does* save umlaut characters, it's just that what you're using to
read the file isn't recognising that it's UTF-8. You later say:

> Opening the text file in notepad and selecting save as shows its ANSI,
> not UTF8


That's just notepad being confused.

UTF-8 works fine, the framework works fine - but some of your tools may
not be doing what you want them to.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about encodings.
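
One common workaround, sketched here under assumptions rather than taken from the thread, is to write a UTF-8 byte order mark so that viewers have something to detect. BomDemo and WriteBytes are hypothetical names, and whether a particular editor honours the mark is an assumption about that tool.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomDemo
{
    // Writes text through a StreamWriter with the given encoding and
    // returns the raw bytes as hex, including any preamble (BOM).
    public static string WriteBytes(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            return BitConverter.ToString(stream.ToArray());
        }
    }
}

// Usage:
//   BomDemo.WriteBytes("\u00E5\u00E4\u00F6", new UTF8Encoding(false))
//       returns "C3-A5-C3-A4-C3-B6" (no marker, the viewer has to guess)
//   BomDemo.WriteBytes("\u00E5\u00E4\u00F6", new UTF8Encoding(true))
//       returns "EF-BB-BF-C3-A5-C3-A4-C3-B6" (three-byte UTF-8 BOM first)
```

The text bytes are identical in both cases; only the three-byte prefix differs.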

 
Guest
23rd Dec 2003
You're right, because Notepad isn't standard at all for reading text files.
Nobody in their right mind uses it or WinTail etc. to view logs. No, no, not
at all.

It's fine when I specify Encoding.Default on StreamWriter, yet it's NOT when
I don't specify ANY encoding anywhere in the app.



 
Frans Bouma
23rd Dec 2003
Jon Skeet [C# MVP] <(E-Mail Removed)> wrote in
news:(E-Mail Removed):
> <(E-Mail Removed)> wrote:
>> Because I used the same format all the
>> way thru the code and its umlauted ok but when its writing (using the
>> default ctors) its garbled. I wiped the file, changed it to construct
>> the SR with Encoding.Default and its saving the umlat charset now,
>> howcome the usual ctor with FileStream doesnt save umlaut chars then as
>> nowwhere else did I specify any form of encoding until this change to
>> fix it.

>
> It *does* save umlaut characters, it's just that what you're using to
> read the file isn't recognising that it's UTF-8. You later say:


The byte specification in the actual raw data is missing the UTF-8
specification when you use Default. I was bitten by the same thing: I had
to explicitly state Encoding.Unicode. When I used Encoding.Default, it
should have worked according to the docs, but it didn't. It did save things
like Scandinavian characters to the file, but it couldn't read them back
correctly, even if I stated UTF-8 as the encoding in the XML header. So I
think he's right.

>> Opening the text file in notepad and selecting save as shows its ANSI,
>> not UTF8

>
> That's just notepad being confused.
> UTF-8 works fine, the framework works fine - but some of your tools may
> not be doing what you want them to.


If you specify Encoding.Unicode, it will work, if you specify
Encoding.Default it will not in some cases. In both cases, the files do
NOT have an XML heading explaining the encoding. The actual encoding is in
the bytes in the file (and probably in a meta-data property in NTFS). That
specification is not read back / or written correctly when you use
Default. I think that's the reason for his complaint and I have to admit,
he's right, I had exactly the same thing.
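
One possible reading of the behaviour described above, offered as a sketch and not as a diagnosis of the original problem: Encoding.Unicode writes a byte order mark that StreamReader can detect, while a single-byte encoding writes none, so the reader falls back to UTF-8 unless told otherwise. RoundTripDemo is a hypothetical name, and ISO-8859-1 is an assumed stand-in for Encoding.Default.

```csharp
using System;
using System.IO;
using System.Text;

public static class RoundTripDemo
{
    // Writes text with writeEncoding, then reads the bytes back.
    // A null readEncoding selects StreamReader(Stream), which assumes
    // UTF-8 unless it finds a byte order mark at the start of the data.
    public static string RoundTrip(string text, Encoding writeEncoding, Encoding readEncoding)
    {
        byte[] bytes;
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, writeEncoding))
            {
                writer.Write(text);
            }
            bytes = stream.ToArray();
        }

        using (var reader = readEncoding == null
            ? new StreamReader(new MemoryStream(bytes))
            : new StreamReader(new MemoryStream(bytes), readEncoding))
        {
            return reader.ReadToEnd();
        }
    }
}

// Usage, with text = "\u00E5\u00E4\u00F6" and
// latin1 = Encoding.GetEncoding("iso-8859-1"):
//   RoundTrip(text, Encoding.Unicode, null)  returns text (BOM detected)
//   RoundTrip(text, latin1, latin1)          returns text (encodings match)
//   RoundTrip(text, latin1, null)            does not return text
```

In this sketch the fix for the third case is to pass the same encoding to the reader as was used by the writer.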

Frans

--
Get LLBLGen Pro, the new O/R mapper for .NET: http://www.llblgen.com
 