Bug in StreamReader.ReadLine()? It reads special chars wrong...

VMI

When I call ReadLine on an ASCII file that contains special characters (e.g.
the 'Ñ' in "NUÑEZ PEREZ"), the character is silently dropped, so "NUÑEZ
PEREZ" becomes "NUEZ PEREZ". How can this be avoided? I need to compare this
string to a string in an Access DB (btw, Access also mangles the character,
replacing "Ñ" with "-"). The two strings won't match because the two systems
storing the data interpret the character differently. The same thing happens
with accented characters (e.g. 'ó').

Thanks.
 
David Browne

ASCII is a 7-bit encoding and has no 'Ñ'. For the file to contain that
character, it must use an encoding other than ASCII. You need to find out
which encoding that is and pass it to the StreamReader. By default it uses
UTF8Encoding.

Try

StreamReader sr = new StreamReader("foo.txt", System.Text.Encoding.Default);

That will use your computer's regional settings to get the current code page.
Alternatively, figure out which code page was used for your file and specify
it explicitly.
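
To illustrate the point about naming the encoding explicitly, here is a minimal sketch. It assumes the file was written as ISO-8859-1 (Latin-1); that choice, the class name CodePageDemo, and the helper ReadFirstLine are made up for the example and are not from the original post. The same bytes are decoded twice: once with the matching encoding and once as UTF-8.

```csharp
using System;
using System.IO;
using System.Text;

public static class CodePageDemo
{
    // Reads the first line of a file, decoding its bytes with the given encoding.
    public static string ReadFirstLine(string path, Encoding encoding)
    {
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        // Assumption for this sketch: the data was saved as ISO-8859-1,
        // where 'Ñ' (U+00D1) is the single byte 0xD1.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string expected = "NU\u00D1EZ PEREZ";

        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, latin1.GetBytes(expected + "\r\n"));

            // Decoding with the encoding the file was written in keeps the 'Ñ'.
            Console.WriteLine(ReadFirstLine(path, latin1) == expected);

            // A lone 0xD1 byte is not valid UTF-8, so the 'Ñ' does not survive
            // when the same bytes are decoded as UTF-8.
            Console.WriteLine(ReadFirstLine(path, Encoding.UTF8) == expected);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The first comparison should print True and the second False. What the failed decode produces in place of the 'Ñ' (a dropped character or a replacement character) depends on the framework version, so the sketch only checks that the strings differ.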

David
 
VMI

Never mind.

I tried:
new StreamReader(filename, Encoding.Default);
and it worked fine with one of the special chars. I hope it works with all
of the other unusual chars too.

Thanks anyway.
 
Joerg Jooss

Whether it works with all of them depends on what "all" are. In the end,
Encoding.Default is just your Windows default (8-bit) code page. If you only
deal with text files created in that particular environment, you're safe.
Otherwise you're better off using Unicode.
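
As a rough sketch of the Unicode route, assuming you control how the file is written: write and read with the same explicit Unicode encoding (UTF-8 here) so the result no longer depends on the machine's code page. The class name Utf8RoundTrip, the helper RoundTrip, and the sample text are invented for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class Utf8RoundTrip
{
    // Writes one line of text as UTF-8 to a temp file and reads it back as UTF-8.
    public static string RoundTrip(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
            {
                writer.WriteLine(text);
            }
            using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
            {
                return reader.ReadLine();
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        // 'Ñ' and 'ó' exist in Western European code pages, but 'Ł' and 'ź'
        // do not, so no single 8-bit Western code page could hold this string.
        string sample = "NU\u00D1EZ \u00F3 \u0141\u00F3d\u017A";
        Console.WriteLine(RoundTrip(sample) == sample);
    }
}
```

This should print True. The trade-off is that every program touching the file has to agree on UTF-8; files produced elsewhere in an 8-bit code page still need that code page named when they are read.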

Cheers,
 
