Translating a text to its Unicode Representation

  • Thread starter: Jonathan Seidner

Jonathan Seidner

Hi,
I have a program that contains a text box.
In this text box, the user types this string:
"\ub7c6\ub9d1\ubbd9\ubdc8\ubfcb\uc1b2\uc3a7\uc586\uc7a9\uc9b0\ucbfb"

This is NOT a Unicode string, because the text box obviously takes the text
literally, so it does not include any escape characters. If I were to put it
in the program's code like this: string s = "\ub7c6\ub9d1\ubbd9", it would
have represented Unicode text.

My question is this:
How can I translate text that I assume looks like this in the debugger
(because it was entered via the text box control):
"\\ub7c6\\ub9d1\\ubbd9"
into the actual Unicode string?

This is very confusing; it took me a while to understand why things
did not work as expected.

Thanks,
Jonathan.
 
Jonathan Seidner said:
I have a program that contains a text box.
In this text box, the user types this string:
"\ub7c6\ub9d1\ubbd9\ubdc8\ubfcb\uc1b2\uc3a7\uc586\uc7a9\uc9b0\ucbfb"

This is NOT a Unicode string, because the text box obviously takes the text
literally, so it does not include any escape characters

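Well, it *is* a Unicode string - it's just not being interpreted as a
string of C# escape sequences. Escape sequences are only translated by the
C# compiler when it reads a string literal in source code; text typed at
run time is never processed that way.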
If I were to put it in the program's code like this:
string s = "\ub7c6\ub9d1\ubbd9", it would have represented Unicode text.

My question is this:
How can I translate text that I assume looks like this in the debugger
(because it was entered via the text box control):
"\\ub7c6\\ub9d1\\ubbd9"
into the actual Unicode string?

This is very confusing; it took me a while to understand why things
did not work as expected.

I suggest you search for "\\u" (escaped; "\u" unescaped) in the string
and parse the four hex digits after that, then convert the whole of
that part into the appropriate Unicode character (just cast the integer
to a char). Of course, you'll have to watch out for the user typing in
"\\ub7c6" etc.
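
For illustration, the search-and-parse approach described above might look
like the sketch below. The names EscapeDecoder, DecodeUnicodeEscapes and
HexValue are made up for this example, and treating a doubled backslash as
one literal backslash is only an assumption about what the user intends;
the thread does not settle that.

```csharp
using System;
using System.Text;

public static class EscapeDecoder
{
    // Hypothetical helper: replaces each literal \uXXXX typed by the user
    // with the character it names, and leaves everything else alone.
    public static string DecodeUnicodeEscapes(string input)
    {
        var result = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            // Assumption: two backslashes in a row mean one literal backslash.
            if (input[i] == '\\' && i + 1 < input.Length && input[i + 1] == '\\')
            {
                result.Append('\\');
                i += 2;
                continue;
            }

            // A backslash, the letter u, and four hex digits form one escape.
            if (input[i] == '\\' && i + 5 < input.Length && input[i + 1] == 'u')
            {
                int code = 0;
                bool valid = true;
                for (int j = i + 2; j < i + 6; j++)
                {
                    int digit = HexValue(input[j]);
                    if (digit < 0) { valid = false; break; }
                    code = code * 16 + digit;
                }
                if (valid)
                {
                    result.Append((char)code);   // cast the integer to a char
                    i += 6;
                    continue;
                }
            }

            // Not an escape: copy the character through unchanged.
            result.Append(input[i]);
            i++;
        }
        return result.ToString();
    }

    // Returns the value of one hex digit, or -1 if c is not a hex digit.
    private static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```

With this sketch, text typed as \ub7c6\ub9d1 becomes two characters, while
text typed as \\ub7c6 comes out as the literal text \ub7c6. A malformed
sequence such as \uZZZZ is copied through unchanged.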
 
Hi,
That's exactly my question: how do I "parse the four hex digits"? In other
words, how do I take the string \\ub7c6 and convert it to the Unicode
character? What class/method should I use?

Thanks,
Jonathan.
 
Jonathan Seidner said:
That's exactly my question: how do I "parse the four hex digits"? In other
words, how do I take the string \\ub7c6 and convert it to the Unicode
character? What class/method should I use?

Use int.Parse (or ushort.Parse) specifying NumberStyles.HexNumber, and
giving it the appropriate substring.
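
A minimal sketch of that call, assuming the escape has already been located
and is exactly a backslash, the letter u, and four hex digits. The names
HexEscape and ToChar are hypothetical and exist only for this example.

```csharp
using System;
using System.Globalization;

public static class HexEscape
{
    // Hypothetical helper: expects text such as \ub7c6 (six characters).
    public static char ToChar(string escape)
    {
        // Skip the backslash and the 'u', keeping the four hex digits.
        string hex = escape.Substring(2, 4);

        // HexNumber tells Parse to read the digits as base 16.
        int code = int.Parse(hex, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

        // Cast the integer to the UTF-16 code unit it represents.
        return (char)code;
    }
}
```

Note that NumberStyles.HexNumber also tolerates leading and trailing white
space, and that Parse throws a FormatException on anything that is not hex,
so it may be worth checking the four characters first.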
 
Great!
Thanks!

Jonathan.

Jon Skeet said:
Use int.Parse (or ushort.Parse) specifying NumberStyles.HexNumber, and
giving it the appropriate substring.
 
Alternatively, you could use System.Text.RegularExpressions.Regex.Unescape().
No other coding is required, and it works wonders.
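
A short sketch of that alternative. The wrapper names UnescapeDemo and
Decode are hypothetical; Regex.Unescape itself is the framework method.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Hypothetical wrapper: hands the typed text straight to Regex.Unescape,
    // which converts \uXXXX sequences (and other escapes) to characters.
    public static string Decode(string typed)
    {
        return Regex.Unescape(typed);
    }
}
```

Keep in mind that Regex.Unescape handles more than \u: sequences such as
\t and \n are converted too, and a doubled backslash becomes a single one,
so text containing other backslashes may not come through unchanged.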

Jonathan Seidner said:
Great!
Thanks!

Jonathan.
 