Regex - Is this the best expression to match with...

This works but I was wondering if there is a better way to write this
expression?

Input text: multi-word | optional space = optional space | single word | ;
optional carriage return
---------------------------------
sometext=something;
moretext = something;
other text = something;

Expression:
 
: This works but I was wondering if there is a better way to write this
: expression?
:
: Input text: multi-word | optional space = optional space | single
: word | ; optional carriage return
: ---------------------------------
: sometext=something;
: moretext = something;
: other text = something;
:
: Expression:
: ---------------------------------
: (?<key>(\w+\s*)+)\s*=\s*(?<data>(\w+\s*)+);[\r\n]*

I'd write it this way:

(?<key>[\w\s]+?)\s*=\s*(?<data>\S+);\s*

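The non-greedy +? in key right-trims the captured text: key grabs as
little as it can, so the \s* that follows absorbs any whitespace
before the = sign. As a rule of thumb, nesting quantifiers, as in
(\w+\s*)+, gives the matcher *lots* of ways to split the same text,
and it may try every one of them before giving up on a failed match,
so avoid writing such patterns.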
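
Here is a rough sketch of the trimming effect, run on the sample input from the question. It assumes Python's re module, which spells named groups (?P<name>...) rather than (?<name>...); the rest of the pattern is unchanged. The GREEDY variant and the parse helper are made up for the comparison and are not part of the suggested pattern.

```python
import re

# Suggested pattern, with Python's (?P<name>...) spelling for named groups.
LAZY = re.compile(r"(?P<key>[\w\s]+?)\s*=\s*(?P<data>\S+);\s*")
# Hypothetical variant with a greedy key, for comparison only.
GREEDY = re.compile(r"(?P<key>[\w\s]+)\s*=\s*(?P<data>\S+);\s*")

SAMPLE = "sometext=something;\nmoretext = something;\nother text = something;\n"


def parse(pattern, text):
    """Return (key, data) pairs for every match of pattern in text."""
    return [(m.group("key"), m.group("data")) for m in pattern.finditer(text)]


lazy_pairs = parse(LAZY, SAMPLE)
greedy_pairs = parse(GREEDY, SAMPLE)

print(lazy_pairs)    # keys carry no trailing space
print(greedy_pairs)  # greedy keys keep the space before '='
```

With the greedy key, "moretext " and "other text " come back with their trailing space, because [\w\s]+ takes the whitespace before \s* gets a chance to.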

In your spec above, you said data is a single word, but you used
the same pattern as for key, which accepts several words. Using \S+
instead matches one run of non-whitespace characters, so data can no
longer contain spaces.
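
To make the difference concrete, here is a small sketch comparing the two data patterns, again assuming Python's re module and its (?P<name>...) group syntax. The two test lines are invented for illustration and do not come from the original input.

```python
import re

# Pattern from the question: data reuses the multi-word key pattern.
ORIGINAL = re.compile(r"(?P<key>(\w+\s*)+)\s*=\s*(?P<data>(\w+\s*)+);[\r\n]*")
# Suggested pattern: data is one run of non-whitespace characters.
SUGGESTED = re.compile(r"(?P<key>[\w\s]+?)\s*=\s*(?P<data>\S+);\s*")

multi_word = "name = two words;"   # hypothetical line with a two-word value
punctuated = "path = /usr/bin;"    # hypothetical line with non-word characters

# The original accepts a multi-word value; the suggested pattern rejects it.
original_multi = ORIGINAL.search(multi_word)
suggested_multi = SUGGESTED.search(multi_word)

# \S+ also accepts characters that \w does not, such as '/'.
original_punct = ORIGINAL.search(punctuated)
suggested_punct = SUGGESTED.search(punctuated)

print(original_multi.group("data"))   # two words
print(suggested_multi)                # None
print(original_punct)                 # None
print(suggested_punct.group("data"))  # /usr/bin
```

Whether rejecting "two words" is the right behaviour depends on how strictly the single-word rule in the spec is meant.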

I used \s* at the end rather than [\r\n]* because this looks like a
human-edited config file, and in that context, it's good to be
forgiving about invisible characters. If that reasoning does apply,
consider using ;+ (i.e., one or more semicolons) too.
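
One caveat if you do switch to ;+ is that a greedy \S+ also matches semicolons, so it keeps all but the last one in data. The sketch below shows this, assuming Python's re module and its (?P<name>...) group syntax. Replacing \S+ with [^\s;]+ is my own suggested adjustment and not part of the pattern above. The input line is made up.

```python
import re

# Suggested pattern with ;+ in place of a single ';'.
LOOSE = re.compile(r"(?P<key>[\w\s]+?)\s*=\s*(?P<data>\S+);+\s*")
# Assumed adjustment: exclude ';' from data so ;+ can claim every semicolon.
STRICT = re.compile(r"(?P<key>[\w\s]+?)\s*=\s*(?P<data>[^\s;]+);+\s*")

# Hypothetical hand-edited line: doubled semicolon, trailing spaces, CRLF.
line = "mode = fast;;  \r\n"

loose_match = LOOSE.search(line)
strict_match = STRICT.search(line)

print(repr(loose_match.group("data")))   # 'fast;'
print(repr(strict_match.group("data")))  # 'fast'
# The trailing \s* swallows the spaces and the line ending in both cases.
print(strict_match.group(0) == line)     # True
```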

Hope this helps,
Greg
 