Regex question (easy?)

sb · Sep 22, 2006

Hello,
I have a text file which contains plain text with the normal
carriage-return/linefeed line terminators. With that file I want to find
any occurrence of "%R" (case-sensitive) on any line that does _not_ start
with "#"....so that I can replace it with something else later using
Regex.Replace().

An example file would look something like this (the real file is
~100kb..several hundred lines):

--- start of file ---

# description line...if there are any %R's on this line, ignore them\r\n
hey %Rthere\r\n
how's it %Rgoin?\r\n
%Rfine, you?\r\n

--- end of file ---

In the above, the regular expression should match all occurrences of %R
except the one on the line that starts with "#". Can someone tell me what
the right regex search string would be? I'm sure this is very easy to do
but I'm still a newbie to regex.

Thanks in advance!
sb

Cor Ligthert [MVP] · Sep 22, 2006

sb,

If you are a newbie to Regex, then in my opinion you should either avoid it
or take the time to learn it.

These newsgroups are meant to help you learn to fish, not to hand you the
fish; for that you have to go to a regular shop.

RegexLib
http://www.regexlib.com/Default.aspx

Expresso
http://www.ultrapico.com/Expresso.htm

I hope this helps a little.

Cor

Jon Skeet [C# MVP] · Sep 22, 2006

I wouldn't use regex at all. The regular expression is going to be
harder to understand than straight calls to String methods. Assuming
you've already read a line (with StreamReader.ReadLine) you can just
do:

if (!line.StartsWith ("#") && line.Contains ("%R"))
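
A minimal sketch of how that check might sit inside a read loop. It uses StringReader on an in-memory string rather than StreamReader on a file, and the names TagScanner and LinesWithTag are hypothetical, chosen only for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TagScanner
{
    // Returns every line that does not start with "#" and contains the tag.
    // TagScanner and LinesWithTag are hypothetical names for this sketch.
    public static List<string> LinesWithTag(string text, string tag)
    {
        var hits = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            // ReadLine strips the \r\n terminator and returns null at the end.
            while ((line = reader.ReadLine()) != null)
            {
                // Ordinal StartsWith avoids culture-sensitive comparison;
                // Contains(string) is already ordinal and case-sensitive.
                if (!line.StartsWith("#", StringComparison.Ordinal) && line.Contains(tag))
                {
                    hits.Add(line);
                }
            }
        }
        return hits;
    }
}
```

Calling TagScanner.LinesWithTag(fileText, "%R") on the sample file from the first post should return the three lines that do not begin with "#".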

sb · Sep 23, 2006

Thanks for the response Jon. I coded it your way before I posted...which
works fine of course. However, I think I over-simplified my original post.

I'm essentially building an rtf file from a text file (generated by another
app...not mine) that contains a lot of proprietary tags like %R, %Y,
etc...which represent color tags. To build a proper rtf file color table, I
need to ensure that a color is actually used within the file. So in short,
I need to perform the replacements for each color tag and also know that at
least one replacement was actually made before I add that tag's color to the
color table. I realize that I can accomplish this with simple string
functions...i.e. Replace(), IndexOf(), Contains(). However, I figured that
using a regex may be quicker (and more readable) in that I can do the
parsing once during the creation of a MatchCollection and then just perform
the replacements using the Groups within that MatchCollection.
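
For illustration only, one possible shape for that regex idea, not a tested solution from this thread. It relies on .NET's support for variable-length lookbehind so that a tag only matches when its line does not begin with "#", and it uses a MatchEvaluator to record which tags were actually replaced. The class name, method name, and replacement strings are hypothetical, and the dictionary is assumed to be non-empty:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTagReplacer
{
    // Replaces each known tag on lines that do not start with "#" and
    // records every tag that was replaced at least once in usedTags.
    // Assumes 'replacements' is non-empty (hypothetical helper for this sketch).
    public static string Replace(
        string text,
        IDictionary<string, string> replacements,
        ISet<string> usedTags)
    {
        // Build "%R|%Y|..." from the dictionary keys, escaped for safety.
        string alternatives = string.Join("|", replacements.Keys.Select(k => Regex.Escape(k)));

        // (?<=^(?!#).*) : looking back, the current line must start here (^,
        // per-line because of RegexOptions.Multiline) and must not begin with '#'.
        // '.' never matches '\n', so the lookbehind stays on the current line.
        var pattern = new Regex(
            @"(?<=^(?!#).*)(?:" + alternatives + ")",
            RegexOptions.Multiline);

        return pattern.Replace(text, match =>
        {
            usedTags.Add(match.Value);          // remember that this tag occurred
            return replacements[match.Value];   // substitute its replacement text
        });
    }
}
```

After the call, usedTags holds only the tags that occurred on non-comment lines, which is the information needed before adding a color to the color table. Matching is case-sensitive by default.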

I admit that maybe I'm optimizing too early here.

-sb

Jon Skeet [C# MVP] · Sep 23, 2006

It *may* be faster to use a regular expression - but I wouldn't worry
about that until you've got a working implementation to start with. It
would possibly be more readable to use a regular expression if the
reader is very familiar with regular expressions - but *much* harder
for people who don't use regular expressions very often.

I suggest you write the code in the most obvious way, using Replace,
IndexOf etc, get all your unit tests in place (preferably doing that
before implementing the code, in fact) and then you can safely change
to using regular expressions if you feel there'll be a benefit.
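
As a rough sketch of that plain-string version, assuming one tag is handled per call and an out parameter reports whether anything was replaced. PlainTagReplacer and ReplaceTag are hypothetical names, not code from this thread:

```csharp
using System;

public static class PlainTagReplacer
{
    // Replaces 'tag' with 'replacement' on every line that does not start
    // with "#". 'replaced' is set to true if at least one line was changed.
    public static string ReplaceTag(
        string text, string tag, string replacement, out bool replaced)
    {
        replaced = false;

        // Splitting on '\n' leaves any '\r' at the end of each piece, so
        // joining with "\n" afterwards restores the original terminators.
        string[] lines = text.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            if (!lines[i].StartsWith("#", StringComparison.Ordinal) &&
                lines[i].Contains(tag))
            {
                lines[i] = lines[i].Replace(tag, replacement);
                replaced = true;
            }
        }
        return string.Join("\n", lines);
    }
}
```

A unit test for this could feed in the sample file from the first post and assert both the returned text and the value of the out flag; the same tests would then guard a later switch to a regular expression.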

sb · Sep 23, 2006

Ok, thanks for the advice Jon. I'll leave it as-is for now.

-sb
