Niki Estner said:
I've changed the relevant part this way:
Random rnd = new Random(10);
for (int i=0; i<stringLength; i++)
    builder.Append((char)(rnd.Next()%10+'0'));
bigString = builder.ToString();
Console.WriteLine(bigString.Substring(0,20));
Output is: 13771437564146648143.
Well, we're at least dealing with the same string then.
That gives me roughly the same results for String.IndexOf and using a
regular expression.
And, I wouldn't call it a "problem" ;-)
We don't know the OP's typical data, so random data should give at least a
clue. And I'm using average-case performance.
Well, you're using "whatever case we happen to get" performance -
without knowing anything about the real world data, we don't know what
"average-case" performance is going to be at all. For instance, where
did you get "1234" from? The results change very dramatically if you
change it to "123" for instance, presumably because "123" is found
fairly early in the string, which is where String.IndexOf does better.
(Searching for "123" gives results of 2.89s for String.IndexOf, and
4.52s for regular expressions, for a million iterations (one test
only.))
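To make the "it depends where the match is" point easy to try, here is a rough timing sketch. The string length, iteration count, the class and method names, and the use of StringComparison.Ordinal are all my assumptions; the original test code may well have used the culture-sensitive IndexOf overload, which can behave quite differently.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

static class SearchTiming
{
    // Builds a pseudo-random digit string the same way as the snippet above.
    public static string BuildDigits(int length, int seed)
    {
        Random rnd = new Random(seed);
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++)
            builder.Append((char)(rnd.Next() % 10 + '0'));
        return builder.ToString();
    }

    static void Main()
    {
        string bigString = BuildDigits(10000, 10); // length is an assumption
        const int iterations = 100000;             // so is the iteration count

        foreach (string needle in new string[] { "123", "1234" })
        {
            Regex regex = new Regex(Regex.Escape(needle));
            int pos = -1;

            // Time String.IndexOf (ordinal comparison, by assumption).
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
                pos = bigString.IndexOf(needle, StringComparison.Ordinal);
            sw.Stop();
            Console.WriteLine("{0}: IndexOf -> {1}, {2}ms",
                              needle, pos, sw.ElapsedMilliseconds);

            // Time a non-compiled regular expression on the same input.
            sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                Match m = regex.Match(bigString);
                pos = m.Success ? m.Index : -1;
            }
            sw.Stop();
            Console.WriteLine("{0}: Regex   -> {1}, {2}ms",
                              needle, pos, sw.ElapsedMilliseconds);
        }
    }
}
```

Printing the match position next to each timing shows how early the needle is found, which is the thing that seems to swing the comparison.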
Yes, but in this case you didn't answer the OP's question, you instead
corrected mine, and I think that's inappropriate unless my answer is wrong.
I'm not sure that "corrected" is the right word for what I did -
certainly "commented on".
I don't think my original reply was wrong: you might want to read it again -
I don't think you would have given that guy different advice (at least
considering the performance data I have).
If I'd had the same results you'd had, I would probably have
recommended REs, yes - but definitely with a caveat that it's worth
checking that this really is the performance bottleneck, as IndexOf is
more readable and in most cases has "good enough" performance. When
advocating any "more complex" code for performance reasons, I tend to
put such a caveat in. (I consider regular expressions more complex as
you need to be careful about the search string - if it contains any
characters that are "special" for regexes, you may need to escape them,
for instance.)
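As a small illustration of the escaping caveat, here is a minimal sketch. The class name, method name, and sample strings are made up for the example; Regex.Escape is the standard framework call for turning a search string into a literal pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Returns the index of the first literal occurrence of needle, or -1.
    public static int FindLiteral(string haystack, string needle)
    {
        // Regex.Escape turns metacharacters such as '.' into literals.
        Match m = Regex.Match(haystack, Regex.Escape(needle));
        return m.Success ? m.Index : -1;
    }

    static void Main()
    {
        string haystack = "12341.34";

        // Unescaped, '.' matches any character, so "1.34" matches "1234".
        Console.WriteLine(Regex.Match(haystack, "1.34").Index);   // prints 0

        // Escaped, only the literal text "1.34" matches.
        Console.WriteLine(FindLiteral(haystack, "1.34"));         // prints 4

        // String.IndexOf needs no escaping at all.
        Console.WriteLine(haystack.IndexOf("1.34", StringComparison.Ordinal)); // prints 4
    }
}
```

With IndexOf there is simply nothing to get wrong here, which is part of why I count the regex version as the more complex one.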
Having looked at my reply to you, btw, I noticed I'd used the phrase
"hard-coded", and so I thought I'd apply that to our particular test. I
assume this is pretty much what Boyer-Moore does:
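static int Find1234 (string haystack)
{
    int index=0;
    // Last valid start position is Length-4, so that index+3 stays in range.
    int max = haystack.Length-3;
    while (index < max)
    {
        if (haystack[index]=='1')
        {
            if (haystack[index+1]=='2')
            {
                if (haystack[index+2]=='3')
                {
                    if (haystack[index+3]=='4')
                    {
                        return index;
                    }
                    else
                        // "123" seen: neither '2' nor '3' can start a match.
                        index += 3;
                }
                else
                    // "12" seen: the '2' cannot start a match.
                    index+=2;
            }
            else
                index++;
        }
        else
            index++;
    }
    return -1;
}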
(I think that's correct, isn't it?)
I haven't tried to optimise it any further, but it already just about
out-performs the other ways in a few test cases. (That's not using
compiled reg-exes, admittedly - just your original test code.)
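On the "is that correct?" question, one cheap way to gain some confidence is to compare the hand-coded search with String.IndexOf on lots of random strings. This is only a sketch: the class name, the AgreesWithIndexOf helper, and the trial counts are my own inventions, and the method is copied in so the example stands alone.

```csharp
using System;
using System.Text;

static class Find1234Check
{
    // Copy of the hand-coded search, so this sketch is self-contained.
    public static int Find1234(string haystack)
    {
        int index = 0;
        int max = haystack.Length - 3;
        while (index < max)
        {
            if (haystack[index] == '1')
            {
                if (haystack[index + 1] == '2')
                {
                    if (haystack[index + 2] == '3')
                    {
                        if (haystack[index + 3] == '4')
                            return index;
                        index += 3;
                    }
                    else
                        index += 2;
                }
                else
                    index++;
            }
            else
                index++;
        }
        return -1;
    }

    // Hypothetical helper: true if Find1234 and IndexOf agree on every trial.
    public static bool AgreesWithIndexOf(int trials, int length, int seed)
    {
        Random rnd = new Random(seed);
        for (int t = 0; t < trials; t++)
        {
            StringBuilder sb = new StringBuilder(length);
            // Only the characters '1'..'4', so near-misses are common.
            for (int i = 0; i < length; i++)
                sb.Append((char)('1' + rnd.Next(4)));
            string s = sb.ToString();
            if (Find1234(s) != s.IndexOf("1234", StringComparison.Ordinal))
                return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(AgreesWithIndexOf(10000, 30, 1));
    }
}
```

Restricting the alphabet to the four characters in the needle exercises the skip branches far more often than plain random digits would. Agreement on random input is evidence rather than proof, of course.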