StreamReader Encoding ansi problem

Armin Zingler

LucaJonny said:
Hi,
I've got a problem using StreamReader in VB.NET.
I'm trying to read a text file that contains extended characters, and
these are removed from each line as it is read.

I've read a lot of articles about ANSI encoding like this
http://support.microsoft.com/default.aspx?scid=kb;en-us;889835

but System.Text.Encoding.Default doesn't work!!

Any ideas??
Thanks, LucaJonny!


Are you sure the file is an ANSI encoded text file?


Armin
 
zdrakec

Try setting the Encoding parameter of your StreamReader to "Default". I
mean, not the "default" which you get if you leave it blank, but the
actual word "Default". If your system default encoding (mine is, for
example, Western European (Windows)) can handle extended characters,
your problem is solved.
At least, that solved it for me.

Cheers,
zdrakec
 
zdrakec

Ah, sorry, I see you are already using "Default". I meant to say, of course,
check your system's default setting to make sure it can handle the
extended character set.

Sorry,

zdrakec
 
LucaJonny

I don't understand!
If I open the file with Notepad (in the Open dialog, ANSI is selected
as the default encoding), it looks OK!

If I look at the regional and language options in the system settings,
there is no ANSI Western European encoding, only an ANSI Central
European encoding.

Is this the problem?
If so, why do Notepad (and other editors) read the file correctly?
How do I install the ANSI Western European encoding?

Thanks, LucaJonny!!



 
Armin Zingler

What do you mean by "extended" characters? Which character codes? Can you
show us the code you use to read the file? How do you see that characters
are "removed"?


Armin
 
LucaJonny

By "extended" characters I mean:

ù
°
½

à


this is my code:

Dim srStreamReader As StreamReader
Dim sFile As String = "C:\Temp\MyText.txt"
Dim sFileText As String

srStreamReader = New StreamReader(sFile, Encoding.Default)
sFileText = srStreamReader.ReadToEnd()
srStreamReader.Close()

Dim swStreamWriter As TextWriter = New StreamWriter(sFile & ".new")
swStreamWriter.Write(sFileText)
swStreamWriter.Close()

C:\Temp\MyText.txt looks like:

This is my text file.
In Italian "città" means city
The european currency is EURO (€)
.....
......
.......

C:\Temp\MyText.txt.new looks like:

This is my text file.
In Italian "città " means city
The european currency is EURO (â,¬)
.....
......
.......

Thanks




 
Armin Zingler

You wrote that the file that you *read* does not contain the extended
characters. I still don't see how you check this. If you read sFileText from
the stream and display it in the debug window, a message box, or a
text box, are the characters there? If they are there, there is no problem.
As you don't specify the 'Default' encoding when *writing* the file,
'mytext.txt.new' is UTF-8 encoded.
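
To make the mismatch concrete, here is a minimal sketch of what happens to the euro sign when it is written as UTF-8 and then viewed as ANSI. It assumes the .NET Framework and Windows-1252 as the ANSI code page; the module name is made up for illustration, and a Central European system (code page 1250) would map the bytes differently.

```vb
Imports System
Imports System.Text

Module EncodingMismatchSketch   ' hypothetical name, for illustration only
    Sub Main()
        Dim euro As String = ChrW(&H20AC)   ' the euro sign, U+20AC

        ' UTF-8 stores the euro sign as three bytes.
        Dim utf8Bytes() As Byte = Encoding.UTF8.GetBytes(euro)
        Console.WriteLine(BitConverter.ToString(utf8Bytes))   ' E2-82-AC

        ' Assumption: Windows-1252 stands in for the ANSI code page.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        ' An ANSI viewer decodes each of those bytes as a separate character.
        Dim misread As String = ansi.GetString(utf8Bytes)
        Console.WriteLine(misread.Length)   ' 3

        ' The same code page stores the euro sign as a single byte.
        Console.WriteLine(BitConverter.ToString(ansi.GetBytes(euro)))   ' 80
    End Sub
End Module
```

So the string in memory can be perfectly fine while the file on disk looks damaged in an editor that expects ANSI.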


Armin
 
zdrakec

Hey Luca:

Try this:

Dim inputString As String = ""
Dim byteArray() As Byte
Dim inputStream As FileStream
inputStream = New FileStream("pathtosomefile", FileMode.Open, FileAccess.Read)
ReDim byteArray(CInt(inputStream.Length) - 1)
' Read the whole file: the count is the array length, not length - 1,
' otherwise the last byte is never read.
inputStream.Read(byteArray, 0, byteArray.Length)
inputStream.Close()
Dim x As Integer
For x = LBound(byteArray) To UBound(byteArray)
    ' Chr maps each byte to a character (codes 128-255 go through the
    ' current ANSI code page).
    inputString &= Chr(byteArray(x))
Next 'x

You will find that "inputString" has the characters that StreamReader
disregarded.

Hope this is helpful,
zdrakec
 
Jay B. Harlow [MVP - Outlook]

LucaJonny,
| srStreamReader = New StreamReader(sFile, Encoding.Default)
| sFileText = srStreamReader.ReadToEnd()

The sFileText variable more than likely contains your extended characters;
as Armin suggests, use the Debug window in VS to verify.

| Dim swStreamWriter As TextWriter = New StreamWriter(sFile & ".new")
| swStreamWriter.Write(sFileText)
Ah! There's the Rub!!

You wrote the file in UTF-8 instead of ANSI, all your "extended characters"
now have a different encoding.

If you want to write in your Ansi encoding as defined by your Windows
Control Panel, you need to include Encoding.Default on the StreamWriter
also.

Dim swStreamWriter As TextWriter = New StreamWriter(sFile & ".new", False, Encoding.Default)
swStreamWriter.Write(sFileText)


Remember both StreamReader & StreamWriter default to UTF-8 encoding, instead
of the Ansi encoding defined by the regional settings under your Windows
Control Panel.
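
Putting both halves together, a minimal sketch of the full round trip might look like the following. It assumes the .NET Framework, where Encoding.Default is the system ANSI code page (on .NET Core and later it returns UTF-8 instead, so this would not have the same effect there). The module name is made up for illustration, and the path is the one from the original post.

```vb
Imports System.IO
Imports System.Text

Module AnsiRoundTripSketch   ' hypothetical name, for illustration only
    Sub Main()
        ' Path taken from the example earlier in the thread.
        Dim sFile As String = "C:\Temp\MyText.txt"

        ' Assumption: on .NET Framework this is the ANSI code page
        ' configured in the regional settings.
        Dim ansi As Encoding = Encoding.Default

        ' Read with the ANSI encoding.
        Dim sFileText As String
        Using reader As New StreamReader(sFile, ansi)
            sFileText = reader.ReadToEnd()
        End Using

        ' Write with the same encoding; False means overwrite, not append.
        Using writer As New StreamWriter(sFile & ".new", False, ansi)
            writer.Write(sFileText)
        End Using
    End Sub
End Module
```

The Using blocks close both files even if an exception is thrown, which the explicit Close() calls would not do.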

Hope this helps
Jay


 
LucaJonny

I'm so stupid,
I read in ANSI and wrote in UTF-8!!!!

Now it's all right!!
Thanks All!

