unicode textbox problem

tparks69

OK I have some Chinese text in sql server column that looks like this:
12大专题调研破解广东科学发展难题

This is unicode? Anyway, I put this data into a text area like this:
articleArea.InnerHtml = article.Text . . ..

and it works fine (shows Chinese characters). But when I put this data into
an asp:textbox control, it just shows up as is... (12大&# etc...)

Can anyone tell me how to get the characters to appear correctly in the text
box?
 
Jon Skeet [C# MVP]

tparks69 said:
OK I have some Chinese text in sql server column that looks like this:
12大专题调研破解广东科学发展难题

This is unicode? Anyway, I put this data into a text area like this:
articleArea.InnerHtml = article.Text . . ..

and it works fine (shows Chinese characters). But when I put this data into
an asp:textbox control, it just shows up as is... (12大&# etc...)

Can anyone tell me how to get the characters to appear correctly in the text
box?

Chances are it's because the text box is using a different font - and
one which doesn't contain those characters. I don't know if you can
specify fonts for text boxes, but that would be my line of enquiry if I
were you.
 
tparks69

I have the font specified as: style="font-family:Arial Unicode MS" but that
doesn't help. Also I found some code that works but the source data is in a
slightly different format than how I am getting it from SQL server:

So this works:

char[] chinese = {'\u6B22','\u8FCE','\u4F7F','\u7528','\u0020'};
txtHeadline.Text = new string(chinese);

The text shows up as Chinese characters in the textbox... but my data from SQL
server is in a different format... like this...

专 题 etc... what is this format, how can I convert it to '\u6B22'
etc...???
 
Jon Skeet [C# MVP]

tparks69 said:
I have the font specified as: style="font-family:Arial Unicode MS" but that
doesn't help. Also I found some code that works but the source data is in a
slightly different format than how I am getting it from SQL server:

So this works:

char[] chinese = {'\u6B22','\u8FCE','\u4F7F','\u7528','\u0020'};
txtHeadline.Text = new string(chinese);

text shows up as chinese characters in the textbox... but my data from SQL
server is in a different format... like this...

专 题 etc... what is this format, how can I convert it to '\u6B22'
etc...???

Okay, so if the problem is getting the data in the first place, read
this:

http://pobox.com/~skeet/csharp/debuggingunicode.html
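
One way to check what a string really contains is to print the numeric value of every character rather than the characters themselves. The following is only a minimal sketch of that idea; the class and method names (UnicodeDump, Describe) are made up for illustration and the sample strings are stand-ins for whatever comes back from the database.

```csharp
using System;
using System.Text;

static class UnicodeDump
{
    // Hypothetical helper: formats each UTF-16 code unit as U+XXXX.
    public static string Describe(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Two real Chinese characters: two code units.
        Console.WriteLine(Describe("\u6B22\u8FCE"));
        // prints: U+6B22 U+8FCE

        // A literal character reference: eight plain ASCII characters.
        Console.WriteLine(Describe("&#35843;"));
        // prints: U+0026 U+0023 U+0033 U+0035 U+0038 U+0034 U+0033 U+003B
    }
}
```

If the dump shows U+0026 U+0023 followed by digits, the string holds the text of a character reference and not the Chinese character itself.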
 
Mihai N.

Try adding
<globalization requestEncoding="utf-8" responseEncoding="utf-8" />
or
<%@ CODEPAGE = 65001 %>
<% Session.CodePage = 65001 %>
<% Response.CodePage = 65001 %>

(not sure what version you are using)
Also, <%@ CODEPAGE %> will also set Session/Response CodePage,
so there is no need to do it explicitly (just make sure you don't
call them with another value).

If the page has a header and a meta tag, make sure it is also set to utf-8.

tparks69 said:
a asp:textbox control, it just shows up as is... (12大&# etc...)

This tells us the encoding is messed up, not the font.
A bad font would give you squares (missing glyphs).
 
tparks69

Jon, thanks for your replies. I looked at your links... there is a lot of
info there, I will continue to study this, but it hasn't really revealed any
solutions.

Jon Skeet said:
Okay, so if the problem is getting the data in the first place, read
this:

The problem isn't getting the data... I'm getting data from SQL Server in
some encoded format, but I don't know what it is... it starts with an
ampersand '&', then there is a '#' pound sign, then a 5-digit integer, then a
semicolon. I notice it gets translated when I post it exactly, so I'm
trying to describe it here. If I put some spaces between it, a single
character looks like this (remove spaces):

& # 35843 ;

First of all, is this unicode?

Secondly, this is what I know about the problem:

1. I get the data in the above format from SQL Server.
2. On an ASP.NET 2.0 webpage, if I move this data into the InnerHtml
property of a div tag for example, I see Chinese characters as expected, for
example:

articleArea.InnerHtml = article.NativeHeadline; //this works great!!!

3. If I move the same data into an asp:textbox, or an html input textbox, I
get the data exactly as it looks in SQL Server with the ampersand, pound,
number, semicolon format. But I need this to show the Chinese characters.

4. Interestingly, if I cut and paste from the DIV in step 2 to the textbox,
the textbox shows the Chinese characters just fine... but this does not solve
the problem of course, just an interesting note.

5. I did try changing the font to Arial Unicode MS; that also didn't work.

6. Another interesting note, a gridview column will display the data just
fine too!

7. Is there some conversion I can do in C# server side to get it to work?
I know for example, that if the data is in this format:

char[] chinese = {'\u6B22','\u8FCE','\u4F7F','\u7528','\u0020'};

and I move this to the textbox, the data also shows as expected. But how to
get the ampersand, pound, number, semicolon translated to the above?

Any other ideas for me are appreciated, I just need to know how to get the
textbox control to show Chinese and not weird codes.

Thanks

 
Jon Skeet [C# MVP]

tparks69 said:
Jon, thanks for your replies. I looked at your links... there is a lot of
info there, I will continue to study this, but it hasn't really revealed any
solutions.

The problem isn't getting the data... I'm getting data from sql server in
some encoded format, but I don't know what it is... it starts with an
ampersand '&', then there is a '#' pound sign, then a 5 digit integer, then a
semi colon.

How are you viewing that? My guess is that you've got it in XML
somewhere - those sound like XML character entities.

I strongly recommend that you write a small *console* application which
fetches the string and dumps them out as

tparks69 said:
I notice it gets translated when I post it exactly, so I'm
trying to describe it here. If I put some spaces between it, a single
character looks like this (remove spaces):

& # 35843 ;

First of all, is this unicode?

It's an XML character reference.

tparks69 said:
Secondly, this is what I know about the problem:

1. I get the data in the above format from SQL server.

Is the data meant to contain XML, or are you asking for it in XML?

tparks69 said:
2. On an ASP.NET 2.0 webpage, if I move this data into the InnerHtml
property of a div tag for example, I see Chinese characters as expected, for
example:

articleArea.InnerHtml = article.NativeHeadline; //this works great!!!

3. If I move the same data into an asp:textbox, or an html input textbox, I
get the data exactly as it looks in SQL server with the ampersand, pound,
number, semicolon format. But I need this to show the Chinese characters.

The difference is that when you specify the InnerHtml property, you're
specifying HTML, which includes the character references.

Put it this way: if your string were actually "<b>hello</b>" then by
specifying it as InnerHtml you'd get "hello" in bold, whereas as the
text for a textbox you'd get the markup shown literally, tags and all.

tparks69 said:
4. Interestingly, if I cut and paste from the DIV in step 2 to the textbox,
the textbox shows the Chinese characters just fine... but this does not solve
the problem of course, just an interesting note.

5. I did try changing the font to Arial Unicode MS; that also didn't work.

6. Another interesting note, a gridview column will display the data just
fine too!

7. Is there some conversion I can do in c# server side to get it to work?
I know for example, that if the data is in this format:

char[] chinese = {'\u6B22','\u8FCE','\u4F7F','\u7528','\u0020'};

and I move this to the textbox, the data also shows as expected. But how to
get the ampersand, pound, number, semicolon translated to the above?

You need to find out why you're getting the character reference to
start with. Where did the data come from? Is it meant to be plaintext,
or XML already?
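
To illustrate the InnerHtml versus textbox difference described above, here is a rough sketch. It assumes the textbox HTML-encodes its text when the page is rendered, and it uses HttpUtility.HtmlEncode as a stand-in for that step; the class name EncodingDemo and the sample string are illustrative only.

```csharp
using System;
using System.Web;

static class EncodingDemo
{
    static void Main()
    {
        // Stand-in for the value read from the database.
        string fromDb = "&#35843;";

        // InnerHtml-style output: the string goes into the page as markup,
        // so the browser resolves the character reference itself.
        Console.WriteLine(fromDb);
        // prints: &#35843;

        // Text-style output (assumed): the value is HTML-encoded first,
        // so the ampersand is escaped and the reference is shown literally.
        Console.WriteLine(HttpUtility.HtmlEncode(fromDb));
        // prints: &amp;#35843;

        // The same thing happens to real markup.
        Console.WriteLine(HttpUtility.HtmlEncode("<b>hello</b>"));
        // prints: &lt;b&gt;hello&lt;/b&gt;
    }
}
```

Once the ampersand has been escaped, the browser no longer sees a character reference, which would explain why the codes appear as-is in the textbox.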
 
tparks69

I figured out the solution. I put a HttpUtility.HtmlDecode on the string
coming from the DB and got the correct output.

The data did come from XML before being put in the DB, and probably a web
page before that, so it must have been HTML encoded at some point I'm
guessing.
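
For anyone finding this later, a small sketch of what the decode step does, written as a console program so it can be run on its own. The class name DecodeDemo is made up; the sample value is the single character reference mentioned earlier in the thread. In the page itself the call would be something like txtHeadline.Text = HttpUtility.HtmlDecode(article.NativeHeadline);

```csharp
using System;
using System.Web;

static class DecodeDemo
{
    static void Main()
    {
        // Stand-in for the HTML-encoded value stored in the database.
        string fromDb = "&#35843;";

        // HtmlDecode turns the character reference into the real character.
        string decoded = HttpUtility.HtmlDecode(fromDb);

        // The number in the reference is the decimal Unicode code point:
        // 35843 decimal is 8C03 hex, i.e. '\u8C03'.
        Console.WriteLine(decoded == "\u8C03");   // prints: True
        Console.WriteLine((int)decoded[0]);       // prints: 35843
    }
}
```

The results are printed as a boolean and a number because a console window may not be able to display the Chinese character itself.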


 
Stepanus David Kurniawan

Hi all,

I wanted to display Unicode characters on some controls using C# (I use
Ms Visual C# 2005 Express Edition).

I've found a problem when I wanted to display Unicode characters in
controls besides the rich text box (such as label, button, toolstripbutton,
etc).

The Unicode can be displayed correctly using the rich text box. However, in
other controls many characters are shown as a "box".

E.g. the Unicode characters with hex values 0100 and 8888 display on a button
as "Ā" followed by a box, whereas the rich text box shows "Ā袈" (correct).

Can anyone help me with this? I appreciate any help...

Thanks and Regards,
David
 
