utf8-encoding

beachboy

Hello all,

I am building a CMS that supports two languages: English and Traditional Chinese.
My problem is that all data is displayed as "?????????", even though all page encodings are set to
UTF-8.

Do I need to encode the text as UTF-8 before inserting it into the DB?
Or do I need to do anything when the content is displayed?

Thanks in advance.
 
Jon Skeet [C# MVP]

beachboy said:
I am building a CMS that supports two languages: English and Traditional Chinese.
My problem is that all data is displayed as "?????????", even though all page encodings are set to
UTF-8.

Do I need to encode the text as UTF-8 before inserting it into the DB?
Or do I need to do anything when the content is displayed?

You'll need to give more information about your situation, I'm afraid.
Describe the different layers, whether it's a webapp or a Windows Forms
app etc.

You shouldn't need to do any work before putting a string into a
database though, so long as the database (and the column in question)
supports Unicode.

See http://www.pobox.com/~skeet/csharp/debuggingunicode.html for more
information on how to diagnose problems.

Jon
 
Lasse Vågsæther Karlsen

beachboy said:
Hello all,
I am building a CMS that supports two languages: English and Traditional Chinese.
My problem is that all data is displayed as "?????????", even though all page encodings are set to
UTF-8.
Do I need to encode the text as UTF-8 before inserting it into the DB?
Or do I need to do anything when the content is displayed?

Thanks in advance.

In addition to what Jon listed, it would be useful to know how you are displaying
those ????? values. All too often I see long discussions about MS SQL Server not
handling Unicode, and in the end it turns out that people are using Enterprise
Manager to look at the data, and EM doesn't handle non-English character sets
very well. The data itself might well be fine.
 
beachboy

Thanks for all the advice.

My project is an ASP.NET C# project.
I use a simple form to insert the content into the database
and a simple query to read it back, and then "?????" is displayed on my
web page.
My page encoding is set to UTF-8 by default.

Is that enough information? Thanks in advance.
 
Jon Skeet [C# MVP]

beachboy said:
Thanks for all the advice.

My project is an ASP.NET C# project.
I use a simple form to insert the content into the database
and a simple query to read it back, and then "?????" is displayed on my
web page.
My page encoding is set to UTF-8 by default.

Is that enough information? Thanks in advance.

That's enough to start with. Now apply the techniques talked about in
the article I linked to before, and find out where the problem is. It
sounds like it shouldn't be too hard to find. Just log the contents of
the string (as Unicode numbers) when you store it in the database and
when you retrieve it. If they're the same, the problem is in the code
communicating between the server and the web browser. If they're different,
the problem is in the code communicating with the database.

Jon
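
The comparison described above could be sketched roughly as follows. This is only an illustration: the class name UnicodeDump, the helper ToHex, and the sample "retrieved" value are assumptions made up for the example, not code from the thread or from the linked article.

```csharp
using System;
using System.Text;

static class UnicodeDump
{
    // Formats each UTF-16 code unit of the string as four hex digits,
    // separated by spaces.
    public static string ToHex(string value)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in value)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("x4"));
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Hypothetical values: the text received from the form, and the text
        // read back from the database by the query.
        string stored = "\u4e2d\u6587";
        string retrieved = "??";

        Console.WriteLine("stored:    " + ToHex(stored));
        Console.WriteLine("retrieved: " + ToHex(retrieved));

        if (ToHex(stored) == ToHex(retrieved))
            Console.WriteLine("Same: look at the server-to-browser code.");
        else
            Console.WriteLine("Different: look at the database code.");
    }
}
```

In a real test the two strings would come from the form handler and from the query result, logged at the point of storing and the point of retrieving.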
 
Mats-Lennart Hansson

Are you using nvarchar in the database? If not, the database cannot handle
these characters.
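
As a rough illustration of where the question marks can come from, the sketch below pushes Chinese text through an encoding that has no representation for it. It uses Encoding.ASCII purely as a stand-in for a non-Unicode column or code page; the class and method names are invented for the example, and the actual conversion a database performs depends on its collation and column types.

```csharp
using System;
using System.Text;

static class QuestionMarkDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding,
    // which is roughly what happens when text passes through a storage layer
    // limited to that encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        string text = "\u4e2d\u6587"; // two Traditional Chinese characters

        // ASCII cannot represent these characters, so each one is replaced
        // with '?' during encoding and the original text is lost.
        Console.WriteLine(RoundTrip(text, Encoding.ASCII));

        // UTF-8 can represent them, so the round trip preserves the text.
        Console.WriteLine(RoundTrip(text, Encoding.UTF8) == text);
    }
}
```

Once the characters have been replaced, no later conversion can bring them back, which is why it matters to find the layer where the replacement happens.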
 
beachboy

Thanks.

Should I use this method to get the Unicode numbers?

static void DumpString (string value)
{
    foreach (char c in value)
    {
        Console.Write ("{0:x4} ", (int)c);
    }
    Console.WriteLine();
}
 
Jon Skeet [C# MVP]

beachboy said:
Should I use this method to get the Unicode numbers?

static void DumpString (string value)
{
    foreach (char c in value)
    {
        Console.Write ("{0:x4} ", (int)c);
    }
    Console.WriteLine();
}

Yes.
 
beachboy

Sorry, I am a beginner at CMS development...
==========
private void Page_Load(object sender, System.EventArgs e)
{
    // Put user code to initialize the page here
    DumpString("stan");
}

public void DumpString(string src)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in src)
    {
        Response.Write( "{0:x4}" , (int)c);
        Response.Write("[]");
    }
}

==========
But I get an error on the line "Response.Write( "{0:x4}" , (int)c);"

Thank you!
 
Jon Skeet [C# MVP]

beachboy said:
Sorry, I am a beginner at CMS development...
==========
private void Page_Load(object sender, System.EventArgs e)
{
    // Put user code to initialize the page here
    DumpString("stan");
}

public void DumpString(string src)
{
    StringBuilder sb = new StringBuilder();
    foreach (char c in src)
    {
        Response.Write( "{0:x4}" , (int)c);
        Response.Write("[]");
    }
}

==========
But I get an error on the line "Response.Write( "{0:x4}" , (int)c);"

Well, you can use Response.Write (string.Format("{0:x4}", (int)c));

However, another alternative is to write the string content to a log
file instead, so that you can still see what appears on the web page,
but separately find out what that consists of.

I'd look at the database side first though, as you can do that with a
console app very easily.

Jon
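
The log-file alternative mentioned above might look something like the sketch below. The class name DumpLog, the Log helper, and the log file location are all assumptions chosen for the example; a web application would need a path that its process is allowed to write to.

```csharp
using System;
using System.IO;
using System.Text;

static class DumpLog
{
    // Formats each UTF-16 code unit of the string as four hex digits.
    public static string ToHex(string value)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in value)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("x4"));
        }
        return sb.ToString();
    }

    // Appends one labelled line, such as "before insert: 4e2d 6587",
    // to the log file at the given path.
    public static void Log(string path, string label, string value)
    {
        File.AppendAllText(path, label + ": " + ToHex(value) + Environment.NewLine);
    }

    static void Main()
    {
        // Hypothetical log location, used only for this illustration.
        string path = Path.Combine(Path.GetTempPath(), "unicode-dump.log");

        Log(path, "before insert", "\u4e2d\u6587");
        Console.WriteLine(File.ReadAllText(path));
    }
}
```

Because the log contains only hex digits, it stays readable even if the tool used to view it has encoding problems of its own.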
 
