Jon said:
Looks like he's compiling it every time in both cases - which is stupid
on both platforms.
One thing to note is that this test almost certainly says *nothing*
about the performance of the languages. Instead, it is a benchmark of
the regular expression implementations.
Oh, and it was created in 2003. Worth trying again, preferably having
fixed the code to not be braindead.
On my laptop the code never finishes as posted (at least not in the time
I allowed it to run), but if I remove RegexOptions.Compiled, it
completes in about 6.5 seconds. I have no idea how the Java program
would fare on my machine, so I can't say how that figure compares to
the "equivalent" Java program.
Let's see what the documentation says about .Compiled:
"Specifies that the regular expression is compiled to an assembly. This
yields faster execution but increases startup time. This value should
not be assigned to the Options property when calling the
CompileToAssembly method."
Ok, so let's check what Pattern.compile does in Java:
"Compiles the given regular expression into a pattern."
Hm, that doesn't say much, but from the docs and all the sources I can
find, Pattern.compile(...) is the equivalent of a plain new Regex(...)
in .NET, without RegexOptions.Compiled.
So with RegexOptions.Compiled, .NET compiles the expression into a
binary assembly running the same kind of code as though you had built
the state machine from scratch, whereas Java just builds the data
structures.
Well, duh, this will take more time. There is no point in compiling the
regular expression into an assembly if it's to be executed once, like the
code posted does.
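
For what it's worth, here is a rough Java sketch of the difference
between rebuilding the pattern on every use and building it once. The
class name, method names, and the pattern itself are made up for
illustration; the original benchmark code isn't reproduced here, so
treat this as an assumption about its shape, not a copy of it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexReuse {
    // Hypothetical pattern for illustration only; not the benchmark's regex.
    private static final String WORD = "\\b[a-z]+\\b";

    // What to avoid: a new Pattern is built for every input line.
    static int countRecompiling(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            Matcher m = Pattern.compile(WORD).matcher(line);
            while (m.find()) {
                total++;
            }
        }
        return total;
    }

    // Better: the Pattern is built once and reused for every line.
    static int countReusing(List<String> lines) {
        Pattern p = Pattern.compile(WORD);
        int total = 0;
        for (String line : lines) {
            Matcher m = p.matcher(line);
            while (m.find()) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("one two", "three", "4 five");
        System.out.println(countRecompiling(lines));
        System.out.println(countReusing(lines));
    }
}
```

Both methods return the same count; only the amount of setup work
differs. The same idea applies on the .NET side: the more expensive the
construction step (as with RegexOptions.Compiled), the more it matters
that it happens once rather than per use.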
This is the main problem with most benchmarks that try to prove one
language better than another: they invariably fall down on some kind of
user error.
In any case, as Jon pointed out, this isn't a benchmark that tells you
anything useful. If anything, it tells you that of the two programs, the
Java one completed, and the C# one either didn't or ran a lot slower.
It doesn't say anything about what can be accomplished if you use the
platform to its fullest, or at least avoid mistakes like this.
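
One way to avoid this kind of mistake is to time the construction of
the pattern separately from the matching, so a one-off setup cost isn't
mistaken for matching speed. A minimal Java sketch, assuming a
hypothetical helper called measure and a made-up regex and input; it is
a single-shot measurement that ignores JIT warm-up, so the numbers are
only a rough indication:

```java
import java.util.regex.Pattern;

class RegexTiming {
    // Hypothetical helper. Returns three values:
    // [0] nanoseconds spent building the pattern,
    // [1] nanoseconds spent running all the matches,
    // [2] number of iterations in which a match was found.
    static long[] measure(String regex, String input, int iterations) {
        long t0 = System.nanoTime();
        Pattern p = Pattern.compile(regex);
        long t1 = System.nanoTime();

        long hits = 0;
        for (int i = 0; i < iterations; i++) {
            if (p.matcher(input).find()) {
                hits++;
            }
        }
        long t2 = System.nanoTime();

        return new long[] { t1 - t0, t2 - t1, hits };
    }

    public static void main(String[] args) {
        // Made-up regex and input, purely for illustration.
        long[] r = measure("a+b", "xxaaab", 100000);
        System.out.println("build ns: " + r[0]);
        System.out.println("match ns: " + r[1]);
        System.out.println("hits:     " + r[2]);
    }
}
```

Reporting the two figures separately makes it obvious when a benchmark
is dominated by setup rather than by the matching it claims to measure.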
And besides, measuring the performance of C# would be a whole other
task. What the program *attempts* to do is measure one part of the BCL
against a similar one in Java.
And a rather bad attempt at that.