Character encoding problem?

 
 
CMan
      25th May 2004
Hi,

I am reading a text file using a StreamReader in C#, but the reader is unable
to handle some of the characters.

With the default encoding the program cannot handle accented characters. I
tried opening the file with other encodings, e.g. UTF7.
UTF7 fixed the accents but cannot handle the plus sign (0x2B). I am also having
problems with the Euro symbol (0x80) and the quote (0x92).

How do I read this file correctly?
Is this problem caused by the encoding?
Is there a way to determine the file's encoding at runtime?
How else can I find out the encoding?

Thanks

Colin



 
Jon Skeet [C# MVP]
      25th May 2004
CMan wrote:
> I am reading a text file using a StreamReader in C#, but the reader is unable
> to handle some of the characters.
>
> With the default encoding the program cannot handle accented characters. I
> tried opening the file with other encodings, e.g. UTF7.
> UTF7 fixed the accents but cannot handle the plus sign (0x2B). I am also having
> problems with the Euro symbol (0x80) and the quote (0x92).


If your file is using 0x80 for the Euro symbol, that should help to
narrow it down...

Have you tried using Encoding.Default? That's not the same as the
default encoding for StreamReader when you don't specify an encoding.
(It's the default encoding for your computer, instead.)
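
As a minimal sketch of the difference: a StreamReader constructed without an encoding assumes UTF-8, while passing Encoding.Default asks for the machine's own code page (on the .NET Framework; on .NET Core and later, Encoding.Default is UTF-8 instead). The class and method names below are hypothetical, chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Reads the whole file, decoding its bytes with the encoding the caller names.
    // Note: StreamReader still honours a byte order mark if the file starts with one.
    public static string ReadAllText(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Reads the whole file the way StreamReader does when no encoding is given,
    // i.e. as UTF-8. Bytes that are not valid UTF-8 come back as U+FFFD.
    public static string ReadAllTextAsUtf8(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling EncodingDemo.ReadAllText(path, Encoding.Default) is the variant suggested above. If the file is known to be Windows-1252, Encoding.GetEncoding(1252) names it explicitly, although newer .NET versions need the code pages encoding provider registered before that call succeeds.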

> How do I read this file correctly?


By specifying the correct encoding.

> Is this problem caused by the encoding?


Almost certainly.

> Is there a way to determine the file's encoding at runtime?


There are ways you can try to guess it heuristically, but nothing
foolproof.
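
One such heuristic, sketched under assumptions: look for a byte order mark first, then check whether the bytes form valid UTF-8, and otherwise fall back to an encoding the caller supplies. The EncodingGuesser class is hypothetical, and the guess can be wrong (for example, a single-byte file that happens to be valid UTF-8, or a UTF-32 file whose mark starts with the same two bytes as UTF-16).

```csharp
using System;
using System.Text;

static class EncodingGuesser
{
    // Returns a best guess at the encoding of 'data'; never a certainty.
    public static Encoding Guess(byte[] data, Encoding fallback)
    {
        // 1. A byte order mark, if present, is the strongest hint.
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return Encoding.UTF8;
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode;           // UTF-16, little endian
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode;  // UTF-16, big endian

        // 2. Strict UTF-8 decoding throws on invalid byte sequences, such as a
        //    lone 0x80 or 0x92 from a single-byte code page.
        try
        {
            new UTF8Encoding(false, true).GetString(data);
            return Encoding.UTF8;
        }
        catch (DecoderFallbackException)
        {
            // 3. Not UTF-8: use whatever the caller considers most likely.
            return fallback;
        }
    }
}
```

A call might look like EncodingGuesser.Guess(File.ReadAllBytes(path), Encoding.Default), with the result then passed to the StreamReader constructor.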

> How else can I find out the encoding?


Well, what generated this text file to start with?

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information.

--
Jon Skeet
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
 
 
CMan
      25th May 2004
Thanks Jon.

You got it in one. Encoding.Default fixed it on my machine.

Should have spotted that one.

Colin





 