The short answer is that you can't always determine the encoding from the
content of a file.
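To make that concrete, here is a minimal sketch in Python (the byte string is
made up for illustration): the very same bytes decode cleanly under more than
one single-byte encoding, producing different text each time, so the bytes
alone can't tell you which encoding was intended.

```python
# 0xE9 is 'e acute' in Latin-1/cp1252 but Greek iota in ISO-8859-7,
# and it is not valid UTF-8 at the end of a file -- same bytes,
# three different answers.
data = b"caf\xe9"

print(data.decode("latin-1"))     # cafe with an acute accent
print(data.decode("iso8859_7"))   # caf followed by Greek iota
try:
    data.decode("utf-8")
except UnicodeDecodeError:
    print("not valid UTF-8")
```
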
To see why, you can use Notepad to experiment with creating and saving text
as ANSI, Unicode, Unicode Big Endian, and UTF-8. Try pasting in some text
from foreign web pages, as well as plain English text. Looking at the files
in a hex editor, like XVI32, you will see that for all but ANSI, Notepad
prepends a few bytes (called a Byte Order Mark, or BOM) to indicate the type
of text file. For the two Unicode (UTF-16) flavors, it is the two-byte
sequence (hex) FE FF for big endian or FF FE for little endian; for UTF-8 it
is EF BB BF. Not all applications prepend a BOM. ANSI and your two ISO
encodings always use one byte per character. UTF-16 uses two bytes for most
characters, but four (a surrogate pair) for characters outside the Basic
Multilingual Plane, and UTF-32 always uses four bytes per character. UTF-8
uses a variable number of bytes per character (one to four) and can encode
every Unicode character. When saving as ANSI, Notepad complains if any
character can't be represented as a single byte.
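The BOM check described above is easy to do in code. This is a small sketch,
not a complete detector (the function name and table are my own); the BOM
byte values come from the Unicode standard, and the names are Python codec
names. Note it returns nothing for BOM-less files, which is exactly the
ambiguous case.

```python
# Map of known BOMs to codec names. Longer BOMs must be checked first:
# the UTF-32-LE BOM begins with the same bytes as the UTF-16-LE BOM.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8-sig"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

def sniff_bom(data):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None  # no BOM: ANSI, ISO-8859-x, or BOM-less UTF-8 -- can't tell
```
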
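You can also verify the per-character byte counts directly. A quick check in
Python (the sample characters are my own picks: ASCII, an accented Latin
letter, a BMP symbol, and a character outside the BMP):

```python
# How many bytes each encoding spends on a given character.
# "utf-16-le" is used so the count excludes any BOM.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: "
          f"utf-8 = {len(ch.encode('utf-8'))} bytes, "
          f"utf-16 = {len(ch.encode('utf-16-le'))} bytes")
```

This prints 1, 2, 3, and 4 bytes for UTF-8, while UTF-16 uses 2 bytes for the
first three characters and 4 (a surrogate pair) for the last.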
-Paul Randall