Determining character code page/encoding programmatically

  • Thread starter: LP

LP

I need to determine the encoding or code page of a file programmatically. I
was also asked to work out the original encoding of various records that are
now stored as Unicode in a SQL Server table, so that these records can be
written out to separate files in their original encodings.

Can it be done?

By the way, character encoding and code page are pretty much the same thing,
correct?

Thank you
 
LP said:
I need to determine the encoding or code page of a file programmatically. I
was also asked to work out the original encoding of various records that are
now stored as Unicode in a SQL Server table, so that these records can be
written out to separate files in their original encodings.

Can it be done?

No, I'm afraid - not reliably, for either problem. Almost any UTF-8 file
can also be read as a Windows CP1252 file, for example; it simply decodes
to different characters. You can make a guess, but it's going to be
heuristic and could be wrong.
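
To make the ambiguity concrete, here is a rough sketch in Python (the thread doesn't name a language, so that choice is mine). The function name guess_encoding and the fallback to CP1252 are assumptions for illustration, not a standard API. It checks for a byte order mark, then tries strict UTF-8, and otherwise just assumes CP1252:

```python
import codecs

# The same bytes are valid in both encodings but mean different things.
utf8_bytes = "\u00e9".encode("utf-8")      # e-acute as UTF-8: b'\xc3\xa9'
misread = utf8_bytes.decode("cp1252")      # decodes without error, as two characters


def guess_encoding(data: bytes) -> str:
    """Heuristic guess only; hypothetical helper, and it can be wrong."""
    # A byte order mark is the strongest hint, but most files have none.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # Note: a UTF-32 LE BOM also starts with these bytes; not handled here.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        # Non-ASCII text in a legacy code page rarely happens to be valid UTF-8.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumed fallback; it could equally be any other single-byte code page.
        return "cp1252"
```

Note that pure ASCII data comes back as "utf-8" here even though it is equally valid CP1252, which is exactly why the result is only a guess.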

As for the SQL Server problem, the characters are just stored as Unicode -
nothing records how they were encoded before conversion, so there's no way
of telling whether any one particular string was originally UTF-8, UTF-16,
a legacy code page or anything else.
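
A small Python sketch of why the original encoding is gone once text has been converted to Unicode: different byte sequences decode to the identical string. The helper encodings_that_fit and its candidate list are made up for illustration; at best it tells you which encodings could have represented a string, not which one actually did.

```python
text = "caf\u00e9"  # "cafe" with an acute accent on the e

# Three different sources, two different byte sequences...
from_utf8 = b"caf\xc3\xa9".decode("utf-8")
from_cp1252 = b"caf\xe9".decode("cp1252")
from_latin1 = b"caf\xe9".decode("latin-1")

# ...but after decoding, the strings are indistinguishable.
all_equal = from_utf8 == from_cp1252 == from_latin1 == text


def encodings_that_fit(s, candidates=("ascii", "cp1252", "cp1251", "utf-8")):
    """Hypothetical helper: list the candidate encodings able to represent s."""
    fits = []
    for name in candidates:
        try:
            s.encode(name)
            fits.append(name)
        except UnicodeEncodeError:
            pass  # s contains a character this encoding cannot represent
    return fits
```

For a plain ASCII string every candidate fits, so this only narrows things down when a record contains characters that some code pages lack.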
By the way, character encoding and code page are pretty much the same thing,
correct?

"Code page" is a type of encoding - in other words, each code page is
an encoding, but there are encodings which have no code page, I
believe.
 