Is Unicode support in C# a fake?

Guest

Hi!
I don't know why, but I want to read a file, change some of the content, and
write the new content to another file. The problem is that it
contains Unicode text.

My code is:

System.IO.StreamReader reader =
    new System.IO.StreamReader(this.openFileDialog1.FileName, Encoding.Unicode);

string new_filename = this.openFileDialog1.FileName.Replace(".txt",
    "(###).txt");

System.IO.StreamWriter writer =
    new System.IO.StreamWriter(new_filename, true, Encoding.Unicode);

string FILE = reader.ReadToEnd();
writer.Write(FILE);

writer.Close();
reader.Close();

I have not altered the content at all. This is the whole code from my
function private void openFileDialog1_FileOk(object sender, CancelEventArgs
e).

I hope you can help me. Please?
 
Jon Skeet [C# MVP]

Alexander said:
I don't know why, but I want to read a file, change some of the content, and
write the new content to another file. The problem is that it
contains Unicode text.

And that's fine, but you need to use the right encodings.

My code is:

System.IO.StreamReader reader =
new System.IO.StreamReader(this.openFileDialog1.FileName,Encoding.Unicode);

Is your text genuinely encoded with Unicode? Just because it contains
Unicode characters doesn't mean it's *encoded* as Unicode.

Where did you get your file from in the first place?

(Oh, and what happens when you run your code? You never specified.)
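
To illustrate the distinction between containing Unicode characters and being encoded as UTF-16, here is a minimal sketch. The sample string and the class name are made up for the demonstration and are not taken from the original poster's file. It encodes the same text as UTF-8 and as UTF-16LE (which is what Encoding.Unicode means in .NET), then decodes the UTF-8 bytes with the wrong encoding.

```csharp
using System;
using System.Text;

class EncodingMismatchDemo
{
    static void Main()
    {
        // Hypothetical sample text: "Gr", u-umlaut, sharp s, "e".
        string text = "Gr\u00FC\u00DFe";

        // The same characters produce different byte sequences per encoding.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text);

        Console.WriteLine(BitConverter.ToString(utf8Bytes));  // 47-72-C3-BC-C3-9F-65
        Console.WriteLine(BitConverter.ToString(utf16Bytes)); // 47-00-72-00-FC-00-DF-00-65-00

        // Decoding UTF-8 bytes as UTF-16 does not throw; it silently yields other characters.
        string wrong = Encoding.Unicode.GetString(utf8Bytes);
        Console.WriteLine(wrong == text); // False

        // Decoding with the encoding that was used for writing restores the text.
        string right = Encoding.UTF8.GetString(utf8Bytes);
        Console.WriteLine(right == text); // True
    }
}
```

If the source file was saved as UTF-8, reading it with Encoding.Unicode would produce this kind of silent corruption rather than an exception.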
 
John Vottero

What's the problem? Are you saying that the file you write doesn't contain
Unicode?
 
Guest

Alexander said:
Hi!
I don't know why, but I want to read a file, change some of the content, and
write the new content to another file. The problem is that it
contains Unicode text.

How is that a problem?

My code is:

System.IO.StreamReader reader =
new System.IO.StreamReader(this.openFileDialog1.FileName,Encoding.Unicode);

Encoding.Unicode returns an encoding for UTF-16 (little-endian). Text files are
usually saved as UTF-8, not UTF-16.

What encoding was used to save the text? UTF-7? UTF-8? UTF-16? UTF-32?

string new_filename = this.openFileDialog1.FileName.Replace(".txt",
"(###).txt");

System.IO.StreamWriter writer =
new System.IO.StreamWriter(new_filename, true, Encoding.Unicode);

string FILE = reader.ReadToEnd();
writer.Write(FILE);

writer.Close();
reader.Close();

I have not altered the content at all. This is the whole code from my
function private void openFileDialog1_FileOk(object sender, CancelEventArgs
e).

I hope you can help me. Please?

With what? It's hard to help you if you don't even tell us what the problem
is...

If the problem is that the file is not read and written correctly, just
specify the correct encoding.
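
As a rough sketch of what "specify the correct encoding" could look like when the encoding is not known up front: the StreamReader below is asked to detect a byte order mark and falls back to a caller-supplied encoding otherwise. The class name, the Copy helper, and the command-line file names are hypothetical, and the UTF-8 fallback is only an assumption about the input. Note also that the original code opens the writer with append set to true, so repeated runs would append to an existing output file; this sketch overwrites instead.

```csharp
using System;
using System.IO;
using System.Text;

class CopyWithEncoding
{
    // Hypothetical helper: copies text from sourcePath to targetPath.
    // If the source starts with a byte order mark, that encoding is used;
    // otherwise the fallback encoding is assumed.
    static void Copy(string sourcePath, string targetPath, Encoding fallback)
    {
        using (StreamReader reader = new StreamReader(sourcePath, fallback, true))
        {
            string content = reader.ReadToEnd();

            // CurrentEncoding reflects any detected BOM only after the first read.
            Encoding detected = reader.CurrentEncoding;

            // append: false overwrites the target instead of appending to it.
            using (StreamWriter writer = new StreamWriter(targetPath, false, detected))
            {
                writer.Write(content);
            }
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("usage: CopyWithEncoding <source> <target>");
            return;
        }

        // Assumption: files without a BOM are UTF-8.
        Copy(args[0], args[1], Encoding.UTF8);
    }
}
```

BOM detection only helps when the file actually has a BOM; a BOM-less file in some other encoding still needs the right fallback passed in explicitly.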
 
Maate

System.IO.StreamReader reader =
new System.IO.StreamReader(this.openFileDialog1.FileName,Encoding.Unicode);


The Encoding.Unicode setting in .NET refers to UTF-16LE. If this
doesn't work, your document is possibly encoded with a different UTF,
e.g. UTF-8 (Encoding.UTF8) or UTF-16BE (Encoding.BigEndianUnicode).
Try it out, or post some HEX;-)

Morten
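
For anyone wanting to "post some HEX", a small sketch along these lines prints the first bytes of a file and names the byte order mark, if there is one. The class and method names are invented for illustration. A file without a BOM can still be in any of these encodings, so "no BOM" is not a verdict.

```csharp
using System;
using System.IO;

class BomSniffer
{
    // Hypothetical helper: names the byte order mark at the start of the buffer.
    // The UTF-32 checks come first because the UTF-32LE mark (FF FE 00 00)
    // begins with the UTF-16LE mark (FF FE). A UTF-16LE file whose first
    // character is U+0000 would be misreported; that case is ignored here.
    static string DescribeBom(byte[] b)
    {
        if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0x00 && b[3] == 0x00)
            return "UTF-32LE";
        if (b.Length >= 4 && b[0] == 0x00 && b[1] == 0x00 && b[2] == 0xFE && b[3] == 0xFF)
            return "UTF-32BE";
        if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
            return "UTF-8";
        if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE)
            return "UTF-16LE";
        if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF)
            return "UTF-16BE";
        return "no BOM";
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: BomSniffer <file>");
            return;
        }

        byte[] head = new byte[4];
        int read;
        using (FileStream fs = File.OpenRead(args[0]))
        {
            read = fs.Read(head, 0, head.Length);
        }

        // Keep only the bytes that were actually read (the file may be shorter than 4 bytes).
        byte[] prefix = new byte[read];
        Array.Copy(head, prefix, read);

        Console.WriteLine(BitConverter.ToString(prefix));
        Console.WriteLine(DescribeBom(prefix));
    }
}
```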
 
Ben Voigt [C++ MVP]

Jon Skeet said:
And that's fine, but you need to use the right encodings.


Is your text genuinely encoded with Unicode? Just because it contains
Unicode characters doesn't mean it's *encoded* as Unicode.

Where did you get your file from in the first place?

(Oh, and what happens when you run your code? You never specified.)

I'm going to take a wild guess and say a byte order mark got prefixed to the
output which the OP didn't expect.
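
If the byte order mark is indeed the surprise, the following sketch shows where it comes from. It writes one character to a MemoryStream through a StreamWriter, once with Encoding.Unicode and once with a UnicodeEncoding constructed with byteOrderMark set to false. The class name and the WriteText helper are made up for the demonstration.

```csharp
using System;
using System.IO;
using System.Text;

class BomDemo
{
    // Hypothetical helper: writes text through a StreamWriter and returns the raw bytes.
    static byte[] WriteText(string text, Encoding encoding)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (StreamWriter writer = new StreamWriter(ms, encoding))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            return ms.ToArray();
        }
    }

    static void Main()
    {
        // Encoding.Unicode is UTF-16LE with a preamble (the BOM, FF FE).
        byte[] withBom = WriteText("A", Encoding.Unicode);

        // bigEndian: false, byteOrderMark: false gives UTF-16LE without a preamble.
        byte[] withoutBom = WriteText("A", new UnicodeEncoding(false, false));

        Console.WriteLine(BitConverter.ToString(withBom));    // FF-FE-41-00
        Console.WriteLine(BitConverter.ToString(withoutBom)); // 41-00
    }
}
```

The two extra bytes at the start of the first output are the BOM; the text itself is identical in both cases.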
 
