VB strings, MSXML, Latin 9, and the Euro

Rod Good

Hi,

I have an Excel application which is receiving XML via MSXML4, encoded
with charset=ISO-8859-15. The XML text may contain the Euro symbol. I
want to place the text into a VB string, and sometime later, a cell.

If I decode the XML text into a byte array, I can see that the Euro
has been correctly decoded using the Windows charset CP1252 to the
value 128 (0x80).

However if I create an empty string and append the VB Euro symbol -

Dim str As String
Dim b() As Byte

str = Chr(128)
b = str

' b(0) is the low byte and b(1) the high byte of the first character
MsgBox "low byte=" & b(0) & ", high byte=" & b(1)

I find that the Euro is encoded with low and high byte values of 172
& 32. So when I create a VB string from the XML text, the Euro
character is completely wrong.

I'd really appreciate any insight into what's happening here, and how
to correctly specify the charset when I create the VB string, or
alternatively, the correct character set to use on the XML encoding
side.

Many thanks,

Rod
 
Steve Gerrard

I don't know about displaying it in Excel, but I can tell you this much:

The Euro symbol is Chr(128) in what on my machine (XP) is called
"Windows: Western" in the Character Map program. That is Windows code
page 1252, a 256 character 8 bit character set. It is not quite the
same as ISO-8859-15, which puts the Euro at 164 (hex A4) and does not
use 128 for a printable character.

The Euro symbol in Unicode is hex 20AC. VB stores each character of a
string as a 16 bit value, so that is two bytes. (Hex 20 is 32, and Hex
AC is 172, which should be the two bytes you got. In Windows, 16 bit
values are stored low byte first, then high byte, so b(0) is 172 and
b(1) is 32.)
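
To check the byte order for yourself, something like the following
could be used. It is only a sketch: EuroBytes is a made-up name, and it
uses ChrW(&H20AC) rather than Chr(128) so that the result does not
depend on the machine's ANSI code page.

```vb
' Show how VB stores the Euro sign inside a String.
' EuroBytes is a hypothetical helper name used for illustration only.
Function EuroBytes() As String
    Dim s As String
    Dim b() As Byte

    s = ChrW(&H20AC)    ' Euro sign by its Unicode code point
    b = s               ' copies the string's raw 16 bit data into bytes

    ' b(0) is the low byte (&HAC), b(1) is the high byte (&H20)
    EuroBytes = "low byte=" & b(0) & ", high byte=" & b(1)
End Function
```

Calling MsgBox EuroBytes() should report 172 for the low byte and 32
for the high byte.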

VB strings are Unicode strings. What Excel spreadsheet cells take, I
don't know. But at least that explains more or less what is going on.
You still get to figure what to do about it.
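
If the raw bytes really are ISO-8859-15, one possible way to turn them
into a VB string is an ADODB.Stream with its Charset property set. This
is a rough sketch, not something tested against your setup:
BytesToString and DemoEuro are hypothetical names, and it assumes ADO
is installed and that Windows recognises the charset name
"iso-8859-15".

```vb
' Decode a byte array in a named charset into a VB (Unicode) String.
' Assumption: late-bound ADODB.Stream is available on the machine and
' charsetName is a charset that Windows has registered.
Function BytesToString(bytes() As Byte, ByVal charsetName As String) As String
    Const adTypeBinary As Long = 1
    Const adTypeText As Long = 2

    With CreateObject("ADODB.Stream")
        .Type = adTypeBinary
        .Open
        .Write bytes            ' store the raw bytes as received
        .Position = 0           ' rewind before changing the stream type
        .Type = adTypeText      ' reinterpret the same bytes as text
        .Charset = charsetName
        BytesToString = .ReadText
        .Close
    End With
End Function

Sub DemoEuro()
    Dim latin9(0) As Byte
    latin9(0) = &HA4            ' the Euro in ISO-8859-15
    ' Prints True if the byte was decoded to Unicode 20AC
    Debug.Print AscW(BytesToString(latin9, "iso-8859-15")) = &H20AC
End Sub
```

For bytes that are already in the machine's own ANSI code page (CP1252
on a Western system, where the Euro is 128), StrConv(bytes, vbUnicode)
does the same kind of conversion without needing ADO.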
 
