Character Escapes Don't Work in VB Regex Replace?

 
 
Chris Anderson
25th Feb 2004
Anyone know of a fix (ideally) or an easy workaround to the problem of escape characters not working in regex replacement text? They just come out as literal text.

For example, you'd think that this

Regex.Replace("<stuff>text</stuff>", "<stuff>", "<stuff>\n")

would give you

<stuff>
text</stuff>

But it doesn't. Instead, you get

<stuff>\ntext</stuff>

That's a totally arbitrary example. Escape characters just don't work at all in VB regex replacements, any of them. They work in C#, but that doesn't do me much good unless I start over.

Thanks,
Chris

 
Brian Davis
25th Feb 2004

You can just append the actual characters using Environment.NewLine or
vbLf. VB does not use character escapes in strings the way C# does; instead, you
concatenate the string with the character, e.g.


Regex.Replace("<stuff>text</stuff>", "<stuff>", "<stuff>" & Environment.NewLine)
-or-
Regex.Replace("<stuff>text</stuff>", "<stuff>", "<stuff>" & vbLf)


Note that this is not specific to Regex.Replace; it is just that C# supports
character escapes in strings while VB does not.


Brian Davis
www.knowdotnet.com



"Chris Anderson" <(E-Mail Removed)> wrote in message
news0CDF1AE-ACE1-4858-B7DE-(E-Mail Removed)...
> Anyone know of a fix (ideally) or an easy workaround to the problem of
> escape characters not working in regex replacement text? They just come out
> as literal text.
>
> For example, you'd think that this
>
> Regex.Replace("<stuff>text</stuff>", "<stuff>", "<stuff>\n")
>
> would give you
>
> <stuff>
> text</stuff>
>
> But it doesn't. Instead, you get
>
> <stuff>\ntext</stuff>
>
> That's a totally arbitrary example. Escape characters just don't work at
> all in VB regex replacements, any of them. They work in C#, but that doesn't
> do me much good unless I start over.
>
> Thanks,
> Chris



 
Chris Anderson
25th Feb 2004
I guess I should have pointed that out in the original posting...

I could certainly do it like that or by a few other means in the code. But the problem is that I'm loading replace expressions dynamically from a text file, and I need the ability (specifically) to allow Unicode characters such as \u200e in the replace expression, and also the whole range of general character escapes. I can't hard-code those. I could come up with a little parser to replace character escapes from the script before sending them to the regex replace, but that's a pain.

There's no reason that VB shouldn't support this. The regex engine is part of the .NET Framework and should conform to common functionality, as far as I'm concerned. Anyway, the fact that VB doesn't generally support escape characters doesn't mean that they shouldn't work in regex. After all, $1 means nothing special in VB, but it does in a regex replace string.

Now I'm just ranting... Point being, I need to do this the way it should work. I'm hoping there's either some sort of update available or some crazy secret syntax such as $#\\u002e to do what I need.

Thanks.
 
Jay B. Harlow [MVP - Outlook]
25th Feb 2004
Chris,
As Brian suggested:

VB.NET just does not support C# escape sequences, nor does VB.NET define its
own escape sequences!


The "\n" is only supported in regular expressions and replacement patterns,
not the replace string according to the Character Escapes section of Regular
Expression Language Elements:

http://msdn.microsoft.com/library/de...geElements.asp

This is not a VB.NET problem per se; C# and every other .NET language will
have the same problem, as the Regex class itself defines this behavior.
(Read as: I hope your rant is against the Regex class and not VB.NET! ;-))

Unfortunately I do not know of a predefined routine that will replace the C#
escape sequences with their respective characters. If you build one, I would
recommend using a StringBuilder in the implementation.
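
A minimal sketch of what such a StringBuilder-based routine might look like, assuming only a small subset of escapes is needed (\n, \r, \t, \\ and \uXXXX). The module and function names (EscapeSketch, ExpandEscapes) are hypothetical, and unrecognized or malformed sequences are simply copied through unchanged rather than reported:

```vbnet
Imports Microsoft.VisualBasic
Imports System
Imports System.Globalization
Imports System.Text

Module EscapeSketch
    ' Hypothetical helper: expands a small subset of C#-style escapes
    ' (\n, \r, \t, \\ and \uXXXX). Anything else is copied through unchanged.
    Function ExpandEscapes(ByVal input As String) As String
        Dim sb As New StringBuilder(input.Length)
        Dim i As Integer = 0
        While i < input.Length
            Dim c As Char = input(i)
            ' Ordinary character, or a trailing backslash with nothing after it.
            If c <> "\"c OrElse i = input.Length - 1 Then
                sb.Append(c)
                i += 1
                Continue While
            End If
            Select Case input(i + 1)
                Case "n"c
                    sb.Append(vbLf)
                    i += 2
                Case "r"c
                    sb.Append(vbCr)
                    i += 2
                Case "t"c
                    sb.Append(vbTab)
                    i += 2
                Case "\"c
                    sb.Append("\"c)
                    i += 2
                Case "u"c
                    ' Needs exactly four hex digits after \u.
                    Dim code As Integer
                    Dim isHex As Boolean = i + 5 < input.Length AndAlso _
                        Integer.TryParse(input.Substring(i + 2, 4), _
                                         NumberStyles.AllowHexSpecifier, _
                                         CultureInfo.InvariantCulture, code)
                    If isHex Then
                        sb.Append(ChrW(code))
                        i += 6
                    Else
                        sb.Append(c)
                        i += 1
                    End If
                Case Else
                    ' Unknown escape: keep the backslash as literal text.
                    sb.Append(c)
                    i += 1
            End Select
        End While
        Return sb.ToString()
    End Function
End Module
```

The expanded string would then be passed as the third argument, e.g. Regex.Replace("<stuff>text</stuff>", "<stuff>", ExpandEscapes("<stuff>\n")). Substitutions such as $1 are not touched by this helper, so they should still reach Regex.Replace intact.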

Hope this helps
Jay



 
Chris Anderson
25th Feb 2004
Well, I figured I was hoping too much. I'd still call it a bug, though. It actually says in the documentation for regex that all character escapes are supported both in regular expressions and replacement patterns.

http://msdn.microsoft.com/library/de...cterescapes.asp

Unfortunately, it's only half true in VB. I only mention C# because I wanted to see if this was a VB-specific issue or the .NET regex engine. Using escape characters in C# replacement patterns does work as it says in the documentation, and as in every other language or application that implements regex. I've been using VB for 6 years, and think .NET is great, but it's very disappointing and confusing that VB doesn't implement the regex behavior in a standard way. I'll probably end up writing my own fix, and share it with anyone else who is frustrated by this.
 
Jay B. Harlow [MVP - Outlook]
26th Feb 2004
Chris,
Correct, the "replacement pattern" which is the second argument to
Regex.Replace (in the sample you gave) supports \n, which is the pattern to
match.

However! The "replacement", which is the third argument to Regex.Replace,
which is your argument with \n in it, is not listed on the page you gave.

http://msdn.microsoft.com/library/de...laceTopic6.asp

>> Using escape characters in C# replacement patterns
>> does work as it says in the documentation, and as
>> in every other language or application that implements regex.

No! They do not work!! :-| What documentation? (not the page you gave!)

They do not work in the sense that Regex will not honor them; however, C# itself may.
Try the following (in C#):

string s = System.Text.RegularExpressions.Regex.Replace(
    @"<stuff>text</stuff>", @"<stuff>", @"<stuff>\n");
System.Diagnostics.Debug.WriteLine(s);

Here the @ prefix (a verbatim string literal) tells C# not to process its own
escape sequences.

Notice that the result still contains the \n, as Regex does not modify the
\n in the replacement text.

Hope this helps
Jay


 
Jay B. Harlow [MVP - Outlook]
5th Mar 2004
Chris,
If you are still following this thread...

While searching for something else, I just came across Regex.Escape and
Regex.Unescape, which will escape and unescape strings for you (including
whitespace). It may help in your efforts.
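
A rough sketch of how Regex.Unescape might be applied to replacement text loaded from a file before it is handed to Regex.Replace. The wrapper name ReplaceWithEscapes and its parameters are hypothetical; only Regex.Unescape and Regex.Replace are framework calls. Note that Regex.Unescape also strips the backslash from escaped metacharacters (for example \. becomes .) and is documented to throw ArgumentException on an escape sequence it does not recognize, so it may not suit every script:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module UnescapeSketch
    ' Hypothetical wrapper: converts escapes such as \n or \u200e in the raw
    ' replacement text into the actual characters, then runs the replace.
    Function ReplaceWithEscapes(ByVal input As String, ByVal pattern As String, _
                                ByVal rawReplacement As String) As String
        ' Regex.Unescape leaves substitutions like $1 alone because they
        ' contain no backslash.
        Dim replacement As String = Regex.Unescape(rawReplacement)
        Return Regex.Replace(input, pattern, replacement)
    End Function
End Module
```

With this in place, ReplaceWithEscapes("<stuff>text</stuff>", "<stuff>", "<stuff>\n") should yield <stuff> followed by a line feed and then text</stuff>, which is the result asked for in the original post.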

http://msdn.microsoft.com/library/de...scapeTopic.asp

http://msdn.microsoft.com/library/de...scapeTopic.asp

Hope this helps
Jay