: I think this is very simple, but I am having difficulty doing it. Basically,
: take a comma-separated list:
: abc, def, ghi, jk
:
: A list with only one token does not have any commas:
: abc
:
: The first letter of each token (abc) must not be a number. I am simply
: trying to parse it to get an array of tokens:
: abc
: def
: ghi
: jk
:
: ...or for the single token one:
: abc
:
: I can easily do this with String.Replace and String.Split, but would like to
: do this with regular expressions. Yet I cannot seem to get it to work; here
: is what I have so far:
:
: String input = "abc, def, ghi, jk";
: String pattern = @"^((?<name>\D.*?)(\x2C )?)+?$";
: Match match = Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
:
: Any input would be appreciated,
Consider the following code:
using System;
using System.Text.RegularExpressions;

static void Main(string[] args)
{
    string[] inputs = new string[]
    {
        "abc, def, ghi, jk",
        "abc",
        "good, 1bad, good, 2bad",
        "trailingcomma,",
        ",",
        ",,",
        ",,,",
    };
    string pattern =
        @"^(
            (
                |                   # ignore empties
                (?<token>\D.*?)     # a token worth keeping
                |\d.*?              # or one to ignore
            )
            \s*                     # eat trailing whitespace
            (,\s*|$)                # separator or done
        )+$                         # catch a sequence of the above
        ";
    // IgnorePatternWhitespace allows the layout and # comments in the pattern.
    Regex tokens = new Regex(pattern,
        RegexOptions.IgnorePatternWhitespace);
    foreach (string input in inputs)
    {
        Match m = tokens.Match(input);
        Console.WriteLine("input = [" + input + "]:");
        if (m.Success)
        {
            // Each repetition of the named group leaves its own Capture.
            if (m.Groups["token"].Captures.Count > 0)
                foreach (Capture c in m.Groups["token"].Captures)
                    Console.WriteLine(" - [" + c.Value + "]");
            else
                Console.WriteLine(" - no captures");
        }
        else
            Console.WriteLine(" - no match.");
    }
}
Its output is:
input = [abc, def, ghi, jk]:
 - [abc]
 - [def]
 - [ghi]
 - [jk]
input = [abc]:
 - [abc]
input = [good, 1bad, good, 2bad]:
 - [good]
 - [good]
input = [trailingcomma,]:
 - [trailingcomma]
input = [,]:
 - no captures
input = [,,]:
 - no captures
input = [,,,]:
 - no captures
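Since you asked for an array of tokens rather than printed output, the captures could be collected along these lines. This is only a sketch: the names TokenCaptures and ExtractTokens are made up for illustration, and the pattern is the same one as above, written on a single line without the comments.

```csharp
using System;
using System.Text.RegularExpressions;

static class TokenCaptures   // hypothetical name, for illustration only
{
    // Same pattern as above, minus the whitespace and comments.
    static readonly Regex Tokens = new Regex(
        @"^((|(?<token>\D.*?)|\d.*?)\s*(,\s*|$))+$");

    // Returns the kept tokens, or an empty array if the input does not match.
    public static string[] ExtractTokens(string input)
    {
        Match m = Tokens.Match(input);
        if (!m.Success)
            return new string[0];

        // One Capture per repetition of the named group.
        CaptureCollection captures = m.Groups["token"].Captures;
        string[] result = new string[captures.Count];
        for (int i = 0; i < captures.Count; i++)
            result[i] = captures[i].Value;
        return result;
    }

    static void Main()
    {
        foreach (string t in ExtractTokens("abc, def, ghi, jk"))
            Console.WriteLine(t);
    }
}
```

The caller gets a plain string[] back, so a single-token input such as "abc" simply yields a one-element array.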
It's easy to anticipate Jon Skeet's objections to the regular
expression above, and he'd certainly be on solid ground. Passing the
result of a split through a filter would be much clearer, e.g.,
using System;
using System.Collections;
using System.Text.RegularExpressions;

public static void ExtractGoodTokens(string[] inputs)
{
    // A good token starts with a non-digit; empty strings never match.
    Regex goodtoken = new Regex(@"^\D");
    foreach (string input in inputs)
    {
        ArrayList goodtokens = new ArrayList();
        // Split on commas, discarding whitespace around each one.
        foreach (string token in Regex.Split(input, @"\s*,\s*"))
            if (goodtoken.IsMatch(token))
                goodtokens.Add(token);
        Console.WriteLine("input = [" + input + "]:");
        if (goodtokens.Count > 0)
            foreach (string token in goodtokens)
                Console.WriteLine(" - [" + token + "]");
        else
            Console.WriteLine(" - none");
    }
}
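If generics and LINQ are available to you, the same split-then-filter idea gets shorter still. The following is a rough sketch under that assumption; SplitFilter and GoodTokens are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class SplitFilter   // hypothetical name, for illustration only
{
    // Split on commas (dropping surrounding whitespace), then keep only
    // the tokens whose first character is not a digit.
    public static List<string> GoodTokens(string input)
    {
        return Regex.Split(input, @"\s*,\s*")
                    .Where(token => Regex.IsMatch(token, @"^\D"))
                    .ToList();
    }

    static void Main()
    {
        Console.WriteLine(
            string.Join("|", GoodTokens("good, 1bad, good, 2bad")));
    }
}
```

Empty pieces produced by leading, trailing, or doubled commas fall out on their own, because an empty string cannot match ^\D.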
Hope this helps,
Greg
--
I have felt for a long time that a talent for programming consists largely
of the ability to switch readily from microscopic to macroscopic views of
things, i.e., to change levels of abstraction fluently.
-- Donald E. Knuth, "Structured Programming with go to Statements"