Converting text and detecting encoding

Flix

Hello.

What I want to do is simple: correctly read a text file whose encoding is
not known (it can be ASCII, UTF-7, UTF-8 or Unicode).

I'm thinking of something like this:

1) Read the text as ASCII:

string text = "";
System.Text.Encoding encoding = System.Text.Encoding.ASCII;
using (StreamReader sr = new StreamReader(filePath, encoding))
{
    text = sr.ReadToEnd();
}

2) Implement a static method like the following:

public static System.Text.Encoding GetEncodingFromText(string text)
{
    [...]
}

3) Convert the string "text" into the correct encoding.


I have no idea how to implement points 2 and 3.
Any suggestions are welcome.
 
Barry Kelly

Flix said:
What I want to do is simple: correctly read a text file whose encoding is
not known (it can be ASCII, UTF-7, UTF-8 or Unicode).

It's not really that simple. For example, text can be UTF-16, in either
little-endian or big-endian byte order, without a BOM (byte order mark).
Check out the IsTextUnicode() Win32 API - the functionality in Windows
essentially uses heuristics and guesses.

Check out Encoding.GetPreamble() in the docs for other possible clues.

-- Barry
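
As a rough illustration of the GetPreamble() hint, here is a minimal sketch that compares the first bytes of the data against the preambles of a few encodings. The names EncodingSniffer, DetectFromBom and ReadText are made up for this example, and the fallback encoding is an assumption the caller has to supply. It only recognises files that actually start with a BOM. Note that it works on raw bytes: decoding the file as ASCII first would turn every byte above 0x7F into '?', which cannot be undone afterwards.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class EncodingSniffer
{
    // Encodings that emit a BOM. UTF-32 LE is listed before UTF-16 LE because
    // its preamble (FF FE 00 00) begins with the UTF-16 LE preamble (FF FE).
    // A UTF-16 LE file whose first character is U+0000 would be misread here.
    private static readonly Encoding[] Candidates =
    {
        new UTF32Encoding(false, true),   // FF FE 00 00
        new UTF32Encoding(true, true),    // 00 00 FE FF
        new UTF8Encoding(true),           // EF BB BF
        new UnicodeEncoding(false, true), // FF FE
        new UnicodeEncoding(true, true),  // FE FF
    };

    // Returns the first candidate whose preamble matches the start of the
    // data, or null when no known BOM is present.
    public static Encoding DetectFromBom(byte[] data)
    {
        foreach (Encoding candidate in Candidates)
        {
            byte[] preamble = candidate.GetPreamble();
            if (data.Length >= preamble.Length &&
                data.Take(preamble.Length).SequenceEqual(preamble))
            {
                return candidate;
            }
        }
        return null;
    }

    // Reads the raw bytes, then decodes them with the detected encoding,
    // or with the caller's fallback guess when there is no BOM.
    public static string ReadText(string filePath, Encoding fallback)
    {
        byte[] data = File.ReadAllBytes(filePath);
        Encoding detected = DetectFromBom(data);
        int skip = detected == null ? 0 : detected.GetPreamble().Length;
        Encoding encoding = detected ?? fallback;
        return encoding.GetString(data, skip, data.Length - skip);
    }
}
```

When DetectFromBom returns null, the choice of fallback is still a guess, which is where heuristics such as IsTextUnicode() come in. The StreamReader constructor overload that takes a detectEncodingFromByteOrderMarks flag performs a similar BOM check internally.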
 
Flix

Barry Kelly said:
It's not really that simple. For example, text can be UTF-16, in either
little-endian or big-endian byte order, without a BOM (byte order mark).
Check out the IsTextUnicode() Win32 API - the functionality in Windows
essentially uses heuristics and guesses.

Check out Encoding.GetPreamble() in the docs for other possible clues.

-- Barry

Thank you for your reply.
 
Jon Skeet [C# MVP]

Flix said:
What I want to do is simple: correctly read a text file whose encoding is
not known (it can be ASCII, UTF-7, UTF-8 or Unicode).

That's not simple - it's impossible. Every UTF7 file is a valid ASCII
file, for instance.
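
To make the ambiguity concrete, here is a small sketch with hand-written bytes. The class and method names (AmbiguityDemo, IsPureAscii, SampleUtf7Bytes) are invented for this example. In UTF-7 the character é (U+00E9) is written as the escape "+AOk-", so "café" is stored as the ASCII characters "caf+AOk-". The sample is built from that literal rather than through Encoding.UTF7, which is marked obsolete in recent .NET versions.

```csharp
using System;
using System.Linq;
using System.Text;

public static class AmbiguityDemo
{
    // True when every byte is below 0x80, i.e. the data is valid 7-bit ASCII.
    public static bool IsPureAscii(byte[] data)
    {
        return data.All(b => b < 0x80);
    }

    // The bytes a UTF-7 encoder would produce for "café".
    public static byte[] SampleUtf7Bytes()
    {
        return Encoding.ASCII.GetBytes("caf+AOk-");
    }
}
```

Here AmbiguityDemo.IsPureAscii(AmbiguityDemo.SampleUtf7Bytes()) returns true, and decoding the same bytes with Encoding.ASCII yields the string "caf+AOk-" without any replacement characters. Nothing in the bytes themselves says whether the author meant "café" in UTF-7 or the literal text "caf+AOk-" in ASCII.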
 
