Custom character escape in a regular expression?

Bob

I am trying to parse a fairly simple expression:

arg1|arg2|arg3

The problem is that in arg2, sometimes I see the pipe character. The
difference is that the pipe character is prefixed by a backslash. For
example:

arg1|a\|rg2|arg3

Still, at this point it wouldn't be too hard to parse, but the problem is that
these arguments may also contain any other standard character escapes. For
example:

arg2\n|a\\\|rg2\\|arg3

Should match:

arg2{lf}
a\|rg2\
arg3

Is it possible to do this?

Thanks
 
Kevin Spencer

(?:\\\\|\\\||\\\w+|\w+)+

I used a slight variation of yours with a line break to test:

arg2\n|a\\\|rg2\\|arg3|
a\|rg4
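
For anyone who wants to try this quickly, here is a minimal sketch that runs
the pattern over the test input. Python is my assumption here (the thread
does not name a language), and the names FIELD, text and matches are only
for illustration.

```python
import re

# The pattern from this post, as a raw string so the backslashes reach
# the regex engine unchanged.
FIELD = re.compile(r'(?:\\\\|\\\||\\\w+|\w+)+')

# The two-line test input from this post, with literal backslashes.
text = r'arg2\n|a\\\|rg2\\|arg3|' + '\n' + r'a\|rg4'

# The group is non-capturing, so findall returns each whole match.
matches = FIELD.findall(text)

for m in matches:
    print(m)
```

Each printed line should be one item, with its escape sequences still in
their raw, undecoded form.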

The way this regular expression works is by using alternation. In regular
expressions, the alternatives are tried in the order in which they appear.
That is, at each position a later alternative is only tried if the ones
before it fail to match. So, first, it looks for "\\" (an escaped
backslash), which effectively eliminates all "\\" combinations from the
group. This leaves:

arg2\n|a\|rg2|arg3|
a\|rg4

Next, it looks for the "\|" combination, and by matching that, effectively
eliminates it from the group, leaving:

arg2\n|arg2|arg3|
arg4

Next, it looks for the "\" plus word combination (a backslash followed by
one or more "word" characters, such as "\n"), leaving:

arg2|arg2|arg3|
arg4

Finally, it checks for any run of "\w" (word) characters, leaving:

|||

The alternatives are wrapped in a single non-capturing group that repeats
one or more times, so each run of these pieces between unescaped pipes comes
back as one match, giving one match per item.
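
The matches still contain the raw escape sequences, so a second pass is
needed to get the output asked for in the question (arg2{lf}, a\|rg2\,
arg3). Below is a rough sketch of that step, again assuming Python. The
helper names (unescape, parse) and the escape table are hypothetical, and I
am assuming an unknown escape such as "\|" or "\\" should simply yield the
character after the backslash.

```python
import re

# Same field pattern as in the post.
FIELD = re.compile(r'(?:\\\\|\\\||\\\w+|\w+)+')

# A backslash followed by any single character.
ESCAPE = re.compile(r'\\(.)', re.DOTALL)

# Assumed escape table; extend it with whatever escapes your data uses.
SIMPLE_ESCAPES = {'n': '\n', 't': '\t', 'r': '\r'}

def unescape(field):
    # Decode known escapes; for anything else (including \\ and \|),
    # keep just the character that follows the backslash.
    return ESCAPE.sub(
        lambda m: SIMPLE_ESCAPES.get(m.group(1), m.group(1)), field)

def parse(line):
    # Split into fields with the pattern, then decode each field.
    return [unescape(f) for f in FIELD.findall(line)]

fields = parse(r'arg2\n|a\\\|rg2\\|arg3')
```

Note that the pattern only accepts "word" characters outside of escapes, so
an argument containing spaces or punctuation would be split into several
matches. If that matters for your data, the last alternative would need to
be widened.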

--
HTH,

Kevin Spencer
Microsoft MVP
Professional Numbskull

Hard work is a medication for which
there is no placebo.
 
