NotSupportedException with charset conversions

Guest

I have a strange exception occurring in the CLR 1.1 framework.

I have an EUC-JP file.
If I convert it to SJIS using nkf, load it into SQL Server (using a text
data source), then use the CLR to output it to an EUC-JP web page, I have no
problem.

But if I convert the same EUC-JP file to UTF-16 using iconv, load it
into SQL Server (using a text data source set to Unicode), then use the CLR
to output it to an EUC-JP web page, depending on the text, I occasionally get
a NotSupportedException. This is very strange to me, as (1) the text came
from EUC-JP originally and (2) the Kanji all looks OK.

Has anyone seen anything remotely like this? The stack trace is deep in
CLR, so it's impossible to tell where it's barfing, but here it is:

[NotSupportedException: The specified encoding could not complete the requested conversion on this platform.]
   System.Text.MLangCodePageEncoding.nativeUnicodeToBytes(Int32 codePage, Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Int32 byteCount) +0
   System.Text.MLangEncoder.GetBytes(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Boolean flush) +118
   System.Web.HttpWriter.FlushCharBuffer(Boolean flushEncoder) +213
   System.Web.HttpWriter.Write(Char ch) +22
   System.Web.UI.HtmlTextWriter.Write(Char value) +32
   System.Xml.Xsl.TextOutput.Write(Char outputChar) +13
   System.Xml.Xsl.SequentialOutput.WriteHtmlUri(String value) +487
   System.Xml.Xsl.SequentialOutput.WriteAttributes(ArrayList list, Int32 count, HtmlElementProps htmlElementsProps) +291
   System.Xml.Xsl.SequentialOutput.WriteStartElement(RecordBuilder record) +301
   System.Xml.Xsl.SequentialOutput.OutputRecord(RecordBuilder record) +54
   System.Xml.Xsl.SequentialOutput.RecordDone(RecordBuilder record) +76
   System.Xml.Xsl.RecordBuilder.CanOutput(Int32 state) +75
   System.Xml.Xsl.RecordBuilder.EndEvent(Int32 state, XPathNodeType nodeType) +14
   System.Xml.Xsl.Processor.EndEvent(XPathNodeType nodeType) +88
   System.Xml.Xsl.EndEvent.Output(Processor processor, ActionFrame frame) +13
   System.Xml.Xsl.CopyCodeAction.Execute(Processor processor, ActionFrame frame) +85
   System.Xml.Xsl.ActionFrame.Execute(Processor processor) +24
   System.Xml.Xsl.Processor.Execute() +78
   System.Xml.Xsl.XslTransform.Transform(IXPathNavigable input, XsltArgumentList args, TextWriter output) +75
   GotDotNet.Exslt.ExsltTransform.Transform(IXPathNavigable ixn, XsltArgumentList arglist, TextWriter writer) +122
   Ranking.Ranking_Output_XSL_2.Render(HtmlTextWriter writer) in c:\inetpub\wwwroot\Ranking\Ranking_Output_XSL_2.vb:222
   System.Web.UI.Control.RenderControl(HtmlTextWriter writer) +130
   System.Web.UI.Page.ProcessRequestMain() +1897

Regards,

Jonathan
 
Peter Huang

Hi,

From the description, it seems that the CLR cannot convert the UTF-16 text to
EUC-JP.
To isolate the problem, I think we can write a simple test file and save it
as UTF-16 (Notepad can save in this format).
In the ASPX code-behind, we can read the file (using a StreamReader with
Unicode encoding) and then write the string to the EUC-JP web page to see
whether any error occurs.

If no error occurs, we can generate a test file with the same content as the
EUC-JP file that you convert to UTF-16 and save to SQL Server, and then
compare the string from the test file with the string from SQL Server to see
whether there is any difference.
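One way to do the comparison described above is to list every position where the two strings differ, with the code points printed in hex so that look-alike characters stand out. A minimal sketch in Python (the thread's own code is VB.NET; the function name `diff_code_points` is made up for illustration, and how the two strings are loaded from the file and from SQL Server is left out):

```python
from itertools import zip_longest


def diff_code_points(a, b):
    """Return (index, code point in a, code point in b) for each differing position."""
    def label(ch):
        # zip_longest pads the shorter string with None
        return "<none>" if ch is None else "U+%04X" % ord(ch)

    return [
        (i, label(x), label(y))
        for i, (x, y) in enumerate(zip_longest(a, b))
        if x != y
    ]


if __name__ == "__main__":
    from_file = "1\uFF0D2"  # fullwidth hyphen-minus
    from_db = "1\u22122"    # minus sign, looks almost identical on screen
    for row in diff_code_points(from_file, from_db):
        print(row)
```

Two strings that render the same but report a difference here are a strong hint that the conversion step changed a code point.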

Please give this a try and let me know the result.

Best regards,

Peter Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Guest

Peter,

Thanks for your reply.

"Peter Huang" said:
From the description, it seems that the CLR can not convert the utf-16 to
EUC-JP.

Of course, you're right. It took some time, but I was able to construct a
test case and determine the following:

It seems that both cygwin's iconv and .NET are partially to blame for this,
but .NET's problem is more severe.

First, iconv converts 0xA1DD (EUC Minus Sign) to 0x2212 (UTF-16 Minus Sign).
I am not an expert, but 0xA1DD seems to be an EUC Full Width Minus. It
seems that a better mapping would be to 0xFF0D (UTF-16 Full Width
Hyphen-Minus), but regardless, .NET ought to be smart enough to convert both
Unicode 0xFF0D and Unicode 0x2212 to EUC 0xA1DD.

Unfortunately, as you can see from the test case below, .NET barfs when it
tries to convert Unicode 0x2212 to EUC:

Imports System.Text

Public Class TestItemData_XSL
    Inherits System.Web.UI.Page

#Region " Web Form Designer Generated Code "

    'This call is required by the Web Form Designer.
    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
    End Sub

    'NOTE: The following placeholder declaration is required by the Web Form Designer.
    'Do not delete or move it.
    Private designerPlaceholderDeclaration As System.Object

    Private Sub Page_Init(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Init
        'CODEGEN: This method call is required by the Web Form Designer
        'Do not modify it using the code editor.
        InitializeComponent()
    End Sub

#End Region

    Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
    End Sub

    'Returns the charset named by the "Encoding" request parameter (default: euc-jp).
    Protected Overridable Function GetEncoding() As String
        If (Request("Encoding") = "") Then
            Return "euc-jp"
        Else
            Return Request("Encoding")
        End If
    End Function

    Protected Overrides Sub Render(ByVal writer As System.Web.UI.HtmlTextWriter)

        Response.ContentEncoding = Encoding.GetEncoding(GetEncoding())

        Response.ContentType = "text/xml"
        writer.Write("<?xml version=""1.0"" encoding=""" & GetEncoding() & """?>")
        ' writer.Write(Test_Item_DataSet1.GetXml)
        writer.Write("<Test_Data_Set>")
        writer.Write("<Bad_Character>")
        'U+2212 MINUS SIGN: throws when the response encoding is euc-jp
        writer.Write(Microsoft.VisualBasic.ChrW(&H2212))
        writer.Write("</Bad_Character>")
        writer.Write("<Good_Character>")
        'U+FF0D FULLWIDTH HYPHEN-MINUS: converts without error
        writer.Write(Microsoft.VisualBasic.ChrW(&HFF0D))
        writer.Write("</Good_Character>")
        writer.Write("</Test_Data_Set>")
    End Sub
End Class

When the Encoding param of the URL is set to utf-8, we can see:

<?xml version="1.0" encoding="utf-8"?>
<Test_Data_Set>
<Bad_Character>−</Bad_Character>
<Good_Character>－</Good_Character>
</Test_Data_Set>

but when it is set to euc-jp, we get an exception:
[NotSupportedException: The specified encoding could not complete the requested conversion on this platform.]
   System.Text.MLangCodePageEncoding.nativeUnicodeToBytes(Int32 codePage, Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Int32 byteCount) +0
   System.Text.MLangEncoder.GetBytes(Char[] chars, Int32 charIndex, Int32 charCount, Byte[] bytes, Int32 byteIndex, Boolean flush) +118
   System.Web.HttpWriter.FlushCharBuffer(Boolean flushEncoder) +213
   System.Web.HttpWriter.GetBufferedLength() +26
   System.Web.UI.Control.RenderControl(HtmlTextWriter writer) +152
   System.Web.UI.Page.ProcessRequestMain() +1897

Regards,

Jonathan

http://kerblog.com/earlyedition/archive/2004/10/07/209.aspx
 
Guest

I found a few more exceptions in converting from UTF-16 to EUC-JP:

0x301C (should probably convert to 0xA1C1, but instead generates the same
exception as for 0x2212)
0x2016 (should probably convert to 0xA1C2, but instead generates the same
exception as for 0x2212)

In addition, there seems to be at least one cosmetic issue:

0x00A6 (halfwidth broken bar) converts to 0x7c (vertical bar), should
probably convert to 0x8fa2c3 (broken bar)

but I'm guessing this is more a code page issue than a .NET issue.
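To find out whether a given piece of text contains any of the code points reported so far in this thread as throwing (0x2212, 0x301C, 0x2016), the text can be scanned before it is written to the response. A minimal sketch in Python rather than the thread's VB.NET; the table and the function name `find_reported_failures` are illustrative, and the list covers only the characters named in this thread, so it may well be incomplete:

```python
# Code points that this thread reports as raising NotSupportedException
# when converted from UTF-16 to EUC-JP. The names are the Unicode character names.
REPORTED_FAILURES = {
    0x2212: "MINUS SIGN",
    0x301C: "WAVE DASH",
    0x2016: "DOUBLE VERTICAL LINE",
}


def find_reported_failures(text):
    """Return (index, code point, name) for each reported problem character."""
    return [
        (i, "U+%04X" % ord(ch), REPORTED_FAILURES[ord(ch)])
        for i, ch in enumerate(text)
        if ord(ch) in REPORTED_FAILURES
    ]


if __name__ == "__main__":
    for hit in find_reported_failures("10\u22123\u301C5"):
        print(hit)
```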

Regards,

Jonathan

http://kerblog.com/earlyedition/archive/2004/10/12/210.aspx
 
Peter Huang

Hi,

I am researching the issue now and will update you with new information
ASAP.

Best regards,

Peter Huang
Microsoft Online Partner Support

 
Peter Huang

Hi,

I am sorry for the delayed reply.
I am now involving the relevant support team on this issue to see if there
is any update.
Thank you for your understanding!

Best regards,

Peter Huang
Microsoft Online Partner Support

 
Guest

"Peter Huang" said:
I am sorry for delay reply.
Now I am involving related supporting team about the issue to see if there
is any new update.
Thank you for your understanding!

Hi Peter,

Any news about this?

Thanks,

Jonathan
 
Earl Beaman[MS]

Hi Jonathan,

I was able to repro with a winforms app as well as the aspx code. We will
get someone from the globalization team involved with the issue.

Thanks,
Earl Beaman
Microsoft, ASP.NET

This posting is provided "AS IS", with no warranties, and confers no
rights.
 
Alice Gu [MS]

Hi, Jonathan,

You are right. This is actually not a .NET problem, but rather a problem
with how MLANG converts UTF-16 to EUC-JP. I've reproduced the problem in a
pure MLANG application (no .NET involved). I am checking with the folks who
wrote MLANG on this and will update this group when I hear back from them.

Thank You,
Alice Gu
Microsoft Developer Support

This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm


 
Alice Gu [MS]

Hi, Jonathan,

It seems that this problem (a few EUC-JP characters failing to round-trip to
UTF-16 and back) is known. I am putting in your suggestion as a wish item
on the development team's list. They will consider it for future versions
of Windows.

Thank You,
Alice Gu
Microsoft Developer Support

 
Guest

Hi Alice,

Thanks for your reply.

Alice Gu said:
It seems that this problem (a few EUC-JP characters failing to round-trip to
UTF-16 and back) is known. I am putting in your suggestion as a wish item
on the development team's list. They will consider it for future versions
of Windows.

I don't mind so much about the incorrect conversion of 0x00A6 to |, or that
the round-trip from EUC-JP -> UTF-16 -> EUC-JP results in minor changes to
the characters.

But this isn't just an issue of round-trip conversion being incorrect. Am I
being unreasonable to ask that .NET not throw an exception when trying to
convert fairly frequently used characters like － or ~ into EUC-JP?

Thanks,

Jonathan

http://kerblog.com/earlyedition/archive/2004/10/27/225.aspx
 
Guest

Hi Alice. Did you hear anything more from either the internationalization
team (agreeing to fix the underlying bug) or the .NET team (agreeing to not
throw an exception when .NET encounters these characters)?

Thanks,

Jonathan

 
Alice Gu [MS]

Hi, Jonathan,

There is not going to be a hotfix for MLANG on this. The team has taken it
into consideration for the next version of Windows, though.

Because this exception was actually caused by the error returned by MLANG,
.NET needed to throw the exception. Would it be possible for your code to
handle the exception, and keep going?
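An alternative to catching the exception is to substitute the problem characters before the text reaches the encoder. The sketch below is in Python rather than the thread's VB.NET, and it rests on assumptions that should be checked against your own data: the earlier test in this thread confirms only that U+FF0D converts cleanly; pairing U+301C with U+FF5E and U+2016 with U+2225 follows the fullwidth mappings used by Microsoft's Japanese code page tables and has not been verified here against MLANG. The names `SUBSTITUTIONS` and `normalize_for_eucjp` are made up for illustration.

```python
# Map each code point reported as failing to a look-alike that is assumed
# to convert to the intended EUC-JP character.
SUBSTITUTIONS = {
    0x2212: 0xFF0D,  # MINUS SIGN -> FULLWIDTH HYPHEN-MINUS (confirmed in this thread)
    0x301C: 0xFF5E,  # WAVE DASH -> FULLWIDTH TILDE (assumption)
    0x2016: 0x2225,  # DOUBLE VERTICAL LINE -> PARALLEL TO (assumption)
}


def normalize_for_eucjp(text):
    """Return text with the reported problem characters replaced."""
    # str.translate accepts a dict that maps code points to code points
    return text.translate(SUBSTITUTIONS)


if __name__ == "__main__":
    sample = "5\u22123 \u301C \u2016"
    print([hex(ord(ch)) for ch in normalize_for_eucjp(sample)])
```

The same table could be applied in .NET with chained String.Replace(Char, Char) calls. Note that the substitution changes the stored code points, so it belongs at the output step rather than where the data is loaded.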

Thank You,
Alice Gu
Microsoft Developer Support
