ASCII/UNICODE Encoding

Davide Piras

Hello,

.NET Framework 2, C#, ASP.NET web application...

I have the string "öäü" and I need its ASCII representation in the format:
"\u00F6" stands for "ö" and so on...

Later I need to convert the string "\u00F6....." back to the
original one, "öäü".

I tried a few approaches but could not get the encoders and classes of the
System.Text namespace to work :-(

What is the simplest way to do something like that?

At the moment I am converting by hand with this:

(char)Convert.ToInt32(m.Value.Replace("\\u", ""), 16)

That casts \u00F6 back to ö,

but I think there should be a better way...

Thanks, regards, Davide.
 
Barry Kelly

Davide said:
Hello,

.NET Framework 2, C#, ASP.NET web application...

I have the string "öäü" and I need its ASCII representation in the format:
"\u00F6" stands for "ö" and so on...

ASCII only covers 128 code points. What you're talking about there
is a code page like Latin 1 (ISO 8859-1) or Windows-1252.

Try this out:

---8<---
using System;
using System.Text;

class App
{
    static void Main()
    {
        // Latin 1 maps each of these characters to a single byte.
        Encoding e = Encoding.GetEncoding("ISO-8859-1");
        byte[] encoded = e.GetBytes("öäü");

        // Print each byte as two hex digits.
        foreach (byte b in encoded)
            Console.Write("{0:X2} ", b);
        Console.WriteLine();
    }
}
--->8---

Prints out the following on my machine:

F6 E4 FC

-- Barry
 
Mihai N.

ASCII only covers 128 character points. What you're talking about there
is a code page like Latin 1 (ISO 8859-1) or Windows-1252. ....
Encoding e = Encoding.GetEncoding("ISO-8859-1");

Since .NET strings are Unicode (UTF-16), going through ISO-8859-1 can damage
any character outside the Latin 1 range. The \u notation (popular in Java
.properties files) also suggests that Davide really wants Unicode values.
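
To illustrate the point about damage, here is a small sketch (not a definitive test) that checks whether a string survives a round trip through ISO-8859-1. The class name Latin1Check and the method RoundTrips are made up for this example; the euro sign is simply one character that Latin 1 does not contain.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
static class Latin1Check
{
    // Encodes to ISO-8859-1 bytes, decodes again, and compares with the input.
    public static bool RoundTrips(string s)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        return latin1.GetString(latin1.GetBytes(s)) == s;
    }
}

class App
{
    static void Main()
    {
        // All three characters exist in Latin 1, so nothing is lost.
        Console.WriteLine(Latin1Check.RoundTrips("\u00F6\u00E4\u00FC")); // True

        // U+20AC (euro sign) is not in ISO-8859-1; the encoder substitutes
        // another character, so the round trip no longer matches.
        Console.WriteLine(Latin1Check.RoundTrips("\u20AC")); // False
    }
}
```

A \uXXXX escape keeps the UTF-16 value itself, so it does not have this limitation.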
 
Barry Kelly

Davide said:
Hello,

.NET Framework 2, C#, ASP.NET web application...

I have the string "öäü" and I need its ASCII representation in the format:
"\u00F6" stands for "ö" and so on...

then later I need to convert again from the string "\u00F6....." to the
original one "öäü"

Ah, now I think I understand: you're looking to create escapes for
characters outside 0-127. That ought to be fairly simple for most cases:
create a StringBuilder, check each char value and compare it with 127; if
it's higher, call AppendFormat("\\u{0:X4}", (int) charValue) rather than
just Append(charValue).
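
A minimal sketch of that loop might look like the following. The class name UnicodeEscaper and the method Escape are invented for this example. It escapes each UTF-16 char separately and does not treat a backslash already present in the input specially, so input that contains a literal "\u" would be ambiguous when decoded later.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only.
static class UnicodeEscaper
{
    // Replaces every char above 127 with a \uXXXX escape;
    // ASCII characters are copied through unchanged.
    public static string Escape(string input)
    {
        StringBuilder sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c > 127)
                sb.AppendFormat("\\u{0:X4}", (int)c);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}

class App
{
    static void Main()
    {
        // The input is ö, ä, ü written as C# escapes so that the source
        // file encoding does not matter.
        Console.WriteLine(UnicodeEscaper.Escape("\u00F6\u00E4\u00FC"));
        // prints \u00F6\u00E4\u00FC
    }
}
```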

Similarly, on the other end, you'll have to scan for those \u escapes.
There isn't any built-in functionality that does this, as far as I know.
It might make sense to code up this functionality as a subclass of
Encoding, for reusability reasons.
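
For the decoding direction, one possible sketch uses a regular expression with a MatchEvaluator, which is close to the hand-written conversion in the original question. The names UnicodeUnescaper and Unescape are made up here, and the pattern assumes every escape has exactly four hex digits.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
static class UnicodeUnescaper
{
    // Matches a backslash, the letter u, and exactly four hex digits.
    static readonly Regex EscapePattern = new Regex(@"\\u([0-9A-Fa-f]{4})");

    // Replaces each \uXXXX escape with the char it stands for;
    // everything else is left untouched.
    public static string Unescape(string input)
    {
        return EscapePattern.Replace(input, delegate(Match m)
        {
            int code = int.Parse(m.Groups[1].Value,
                                 NumberStyles.HexNumber,
                                 CultureInfo.InvariantCulture);
            return ((char)code).ToString();
        });
    }
}

class App
{
    static void Main()
    {
        string decoded = UnicodeUnescaper.Unescape("\\u00F6\\u00E4\\u00FC");
        // Compare against ö, ä, ü instead of printing them, so the result
        // does not depend on the console encoding.
        Console.WriteLine(decoded == "\u00F6\u00E4\u00FC"); // True
    }
}
```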

-- Barry
 
