Bug in StreamReader.ReadLine()? It reads special chars wrong...

VMI

When I call ReadLine on an ASCII file containing special characters (e.g. the
'Ñ' in "NUÑEZ PEREZ"), the character is silently dropped, so "NUÑEZ
PEREZ" becomes "NUEZ PEREZ". How can this be avoided? I need to compare
this string to a string in an Access DB (btw, Access also mangles the
character, replacing "Ñ" with "-"). The two strings never match because
each system interprets the character differently. The same thing happens
with accented characters (e.g. 'ó').

Thanks.
 
ASCII is a 7-bit encoding and has no 'Ñ'. In order to have that character,
your file must use an encoding other than ASCII. You must discover what
that encoding is and tell the StreamReader. By default it uses
UTF8Encoding.

Try:
StreamReader sr = new StreamReader("foo.txt", System.Text.Encoding.Default);

That will use your computer's regional settings to get the current code page.
Alternatively, figure out which code page your file uses and specify it
explicitly.
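
To illustrate the point about telling the StreamReader which encoding the file uses, here is a minimal sketch. It assumes the file is ISO-8859-1 (Latin-1), which is only a guess about the original poster's data; the class and method names are hypothetical. Note that Encoding.Default returns the system ANSI code page on .NET Framework but is always UTF-8 on .NET Core and .NET 5+, so naming the encoding explicitly is the more portable choice.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Decodes the first line of the given bytes using the given encoding.
    public static string ReadFirstLine(byte[] bytes, Encoding encoding)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // "NUÑEZ PEREZ" as stored in an ISO-8859-1 file (assumed encoding):
        // 'Ñ' is the single byte 0xD1, which is not valid UTF-8 on its own.
        byte[] bytes =
        {
            0x4E, 0x55, 0xD1, 0x45, 0x5A, 0x20, 0x50, 0x45, 0x52, 0x45, 0x5A
        };

        string expected = "NU\u00D1EZ PEREZ";
        string asUtf8 = ReadFirstLine(bytes, Encoding.UTF8);
        string asLatin1 = ReadFirstLine(bytes, Encoding.GetEncoding("iso-8859-1"));

        // The UTF-8 decoder cannot interpret the lone 0xD1 byte, so the 'Ñ' is lost.
        Console.WriteLine(asUtf8 == expected);   // False
        // With the matching encoding the text survives intact.
        Console.WriteLine(asLatin1 == expected); // True
    }
}
```

If the file was actually written with a different code page, the same approach applies with that encoding passed in instead.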

David
 
Never mind.

I tried:
new StreamReader(filename, Encoding.Default);
and it worked fine with one of the special chars. I hope it works with all
of the strange chars.

Thanks anyway.
 
Depends on what "all" means. In the end, Encoding.Default is just your
Windows default (8-bit) code page. If you only deal with text files created
in that particular environment, you're safe. Otherwise you're better off
using a Unicode encoding such as UTF-8.
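
As a rough sketch of the Unicode suggestion, assuming you control how the file is written: writing and reading with the same explicit UTF-8 encoding round-trips 'Ñ' and 'ó' regardless of the machine's code page. The class name, method name, and temporary file are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8RoundTrip
{
    // Writes one line to the given path as UTF-8, then reads it back as UTF-8.
    public static string WriteThenRead(string path, string line)
    {
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.WriteLine(line);
        }
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // Scratch file used only for this illustration.
        string path = Path.GetTempFileName();
        try
        {
            string original = "NU\u00D1EZ PEREZ, \u00F3";
            string readBack = WriteThenRead(path, original);
            Console.WriteLine(readBack == original); // True
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```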

Cheers,
 