StreamReader and encoding (making me crazy!)

MattB

Hi. I keep going in circles on an issue I can't seem to solve. I wrote a
function that uses a StreamReader to read a text file into a string
variable. It has worked well for some time, but we've discovered it
doesn't work with foreign-language files. We have a text file in French,
and the reader strips out characters like é. That's no good for us. I'm
thinking that if I could make the reader use Unicode instead of UTF-8 it
might help, but I can't seem to get that to work.

My existing code is this:

Public Shared Function File2String(ByVal strFile)
    'Open a file for reading
    Dim strFilename As String = strFile
    'Get a StreamReader class that can be used to read the file
    Dim oReader As System.IO.StreamReader

    'oReader.CurrentEncoding = System.IO.enc

    Try
        oReader = System.IO.File.OpenText(strFilename)
    Catch ex As Exception
        Return Nothing
    End Try

    'Dim test As String = objStreamReader.CurrentEncoding

    Dim str As String = oReader.ReadToEnd.ToString
    oReader.Close()
    Return str
End Function
--------------

Some examples I've found specify the stream and encoding when the reader
is declared. So I've been trying to do that by moving the declaration to
inside the try/catch like this:
Try
    Dim oReader As New StreamReader(System.IO.File.OpenText(strFilename), _
        Encoding.Unicode, True, 1024)
Catch ex As Exception
    Return Nothing
End Try

But IntelliSense is balking at my definition (and many variations of it),
saying: Overload resolution failed because no accessible "New" can be
called with these arguments:...

I think I'm close but I can't seem to get it. Has anyone got any tips or
examples that might work for me? I don't even know if this is the right
way to solve this, but I don't know of any other approaches.

Much thanks in advance for anything useful!

Matt
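
On the overload error quoted above: as far as I know, StreamReader has no constructor that accepts another reader, and File.OpenText already returns a StreamReader, so there is no matching "New" for that argument list. A minimal sketch, assuming the goal is only to choose the encoding, passes the file path itself to the constructor. The module and function names are hypothetical, Using needs VB 2005 or later, and Encoding.Unicode (UTF-16) in the usage line is just the encoding tried above, not a confirmed fix for the French file.

```vbnet
Imports System.IO
Imports System.Text

Public Module FileHelpers
    ' Hypothetical helper: reads a whole file with the encoding the caller picks.
    Public Function File2StringWithEncoding(ByVal path As String, ByVal enc As Encoding) As String
        Try
            ' Overload used: StreamReader(String, Encoding, Boolean, Integer).
            ' True means a byte order mark, if the file has one, overrides enc.
            Using reader As New StreamReader(path, enc, True, 1024)
                Return reader.ReadToEnd()
            End Using
        Catch ex As Exception
            ' Same failure signal as the original function.
            Return Nothing
        End Try
    End Function
End Module
```

It could be called as File2StringWithEncoding(strFilename, Encoding.Unicode), where strFilename is the variable from the original function.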
 
Cor Ligthert

Matt,

Changing the encoding will have no effect if your computer does not
support the right code page. Countries in the Americas, Australia, parts
of Africa, and Western Europe share the same code page (1252).

(It may be that the file was created with a different code page than you
assume, in which case you are in trouble again.)

However, that is the first thing to check in your computer's configuration.

With the information in these links you should be able to find your
problem, I think.

http://www.microsoft.com/globaldev/reference/oslocversion.mspx

http://www.geocities.com/Athens/Academy/4038/graph/fontset.htm#b

I hope this helps.

Cor
 
MattB

Thanks for the reply!
Maybe I have misdiagnosed the problem, because if I open the file in
Notepad, I see the characters. It's when I read the file into a VB.NET
TextReader that I lose characters like é (an e with an acute accent).

Any ideas on clearing that up? My computer seems to support these
characters; it's only when I read the files into my application that
things go wrong.

Matt
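
One reading of these symptoms: Notepad shows é correctly, so the bytes in the file are intact, and the file may be saved in an ANSI code page such as Windows-1252 rather than in UTF-8. In 1252, é is the single byte 0xE9, which is not a valid UTF-8 sequence on its own, so a UTF-8 reader (which is what File.OpenText returns) cannot decode it and drops or replaces it. A sketch under that assumption follows. The code page is a guess that the thread does not confirm, and the module and function names are hypothetical.

```vbnet
Imports System.IO
Imports System.Text

Public Module AnsiFileReader
    ' Hypothetical helper: assumes the file was saved as Windows-1252.
    Public Function ReadAnsiFile(ByVal path As String) As String
        ' Built in on .NET Framework; on .NET Core / .NET 5+ the code page
        ' must first be registered through CodePagesEncodingProvider.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        ' StreamReader(String, Encoding) still honors a byte order mark,
        ' so a file that really is UTF-8 or UTF-16 with a BOM reads correctly.
        Using reader As New StreamReader(path, ansi)
            Return reader.ReadToEnd()
        End Using
    End Function
End Module
```

If the text still comes back wrong, the file was probably written in some other code page, and checking how it was saved (Notepad's Save As dialog shows the encoding) would be the next step.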
 
