Regex bug?? Insufficient hexadecimal digits

Guest

I have a string that contains the escape sequences \", \t, \r and \n. I need to get the XML out of it.

Sample below:
"<?xml version=\"1.0\"?>\r\n<USERS
xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"
xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"
xmlns=\"http://www.slcorp.com\\xml\\slcorp_dtd_schema.xml\">\r\n\t<ACCT>GameTek</ACCT>\r\n\t<USER>\r\n\t\t<USER_ID>Mike</USER_ID></USER>\r\n\t</USERS>\r\n"

I have tried two approaches to replace the escapes so I can get the XML.
(1)
str = str.Replace("\n", "").Replace("\t","").Replace("\r","").Replace("\"",
""");
This code segment (Replace("\"", """);) does not compile; the rest is okay.
-------------------------------------------------------------------------
(2)
I have also tried using Regex as follows:

string str= Regex.Unescape(str);

This time the exception is "Insufficient hexadecimal digits".


Any ideas?
 
Jon Skeet [C# MVP]

Mori said:
I have a string that contains the \", \t, \r, \n. I need to get the xml.

I have tried replacing as follows so I can get the xml. I have tried 2
approaches
(1)
str = str.Replace("\n", "").Replace("\t","").Replace("\r","").Replace("\"",
""");
This code segment (Replace("\"", """);) does not compile, the rest is okay.
-------------------------------------------------------------------------
(2)
I have also tried using Regex as follows

string str= Regex.Unescape(str); This time the exception is "Insufficient
hexadecimal digits"


Any ideas?

""" isn't a valid string. Did you mean ""?

However, I'm not entirely sure what you mean by needing to "get the
XML" - the string *is* the XML. The \r, \n etc are only escapes as far
as C# is concerned.

See http://www.pobox.com/~skeet/csharp/strings.html
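
To illustrate the point about escapes existing only in the C# source, here is a minimal sketch. The class and method names are made up for illustration, and the XML is a shortened version of the sample (namespaces dropped), so treat it as an assumption-laden demo rather than the original code.

```csharp
using System;
using System.Xml;

public static class LiteralDemo
{
    // A regular C# literal: the compiler turns \" \r \n \t into real
    // characters, so the string held at runtime contains no backslashes.
    public const string Xml =
        "<?xml version=\"1.0\"?>\r\n<USERS>\r\n\t<ACCT>GameTek</ACCT>\r\n</USERS>\r\n";

    // Loads the string as XML, with no replacing or unescaping first,
    // and returns the text of the ACCT element.
    public static string ReadAcct()
    {
        var doc = new XmlDocument();
        doc.LoadXml(Xml);
        return doc.DocumentElement["ACCT"].InnerText;
    }
}
```

If the string comes from a literal like this, LiteralDemo.ReadAcct() should hand back the element text directly, which suggests no cleanup step is needed before parsing.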
 
Oliver Sturm

Mori said:
I have a string that contains the \", \t, \r, \n. I need to get the xml.

I have tried replacing as follows so I can get the xml. I have tried 2
approaches
(1)
str = str.Replace("\n", "").Replace("\t","").Replace("\r","").Replace("\"",
""");
This code segment (Replace("\"", """);) does not compile, the rest is okay.
-------------------------------------------------------------------------
(2)
I have also tried using Regex as follows

string str= Regex.Unescape(str); This time the exception is "Insufficient
hexadecimal digits"

In addition to what Jon said, I understand you want to strip the escape
sequences from the XML string by replacing \r, \n and \t with nothing and
\" with ". Right?

In that case, you need to make sure that the escape sequences aren't
recognized as such in the strings you are trying to use in your
replacement. The easiest way to do that is to use verbatim literals, like
this:

str = str.Replace(@"\n", "").Replace(@"\t","").Replace(@"\r","").Replace(@"\""", @"""");

Without verbatim literals, it would have to look like this:

str = str.Replace("\\n", "").Replace("\\t","").Replace("\\r","").Replace("\\\"", "\"");

Using regular expressions is probably not the most performant way to do
this, because you'd have to do two replacements - the only advantage is
that you could replace \r, \n and \t in one go:

str = Regex.Replace(str, @"\\[rnt]", "");
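
Put together, the two variants could look roughly like the sketch below. It assumes the string really does contain backslash sequences at runtime (for example because it was read from a file), and the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeStripper
{
    // Removes the two-character sequences \r, \n and \t (a literal
    // backslash followed by a letter) and turns \" into a plain quote.
    public static string WithReplace(string s)
    {
        return s.Replace(@"\n", "")
                .Replace(@"\t", "")
                .Replace(@"\r", "")
                .Replace(@"\""", "\"");
    }

    // Same idea, but \r, \n and \t are removed in a single regex pass;
    // the quote still needs its own replacement.
    public static string WithRegex(string s)
    {
        string stripped = Regex.Replace(s, @"\\[rnt]", "");
        return stripped.Replace(@"\""", "\"");
    }
}
```

Note that both versions remove any backslash followed by r, n or t, so a path such as c:\temp inside the data would be damaged as well.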

Using Regex.Unescape doesn't make any sense here; it has a completely
different purpose.
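
As for the exception itself, a likely explanation (an assumption, not something confirmed in this thread) is that Regex.Unescape reads \x as the start of a two-digit hexadecimal escape, and the namespace URL in the sample contains \xml, where "ml" are not hex digits. The helper below is a hypothetical sketch for checking that.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Returns true if Regex.Unescape rejects the input with an
    // ArgumentException, false if it unescapes without complaint.
    public static bool Throws(string s)
    {
        try
        {
            Regex.Unescape(s);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

With this, UnescapeDemo.Throws(@"com\xml") should report a failure, while an input such as @"a\tb" should unescape to a string containing a tab.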

Now, I'm sure I made a mistake somewhere with all this escaping -
someone's going to tell me :)


Oliver Sturm
 
