Regular Expressions faster in Java?

pawel

I have compared regular expression performance in C# and Java. The task was
to find out whether a pattern matches some text.
Matching was done with precompiled regular expressions in a loop of 100000
iterations. The loop was executed 11 times and the average elapsed time
was calculated. The code for both classes is below.
I found that the Java implementation is 2 to 5 times faster than C# (it
depends on the complexity of the expression).
Maybe my test was too simple? Could Java have applied some optimisation so that
the code does not actually run (because it really does nothing useful)?

--
Paweł

<<RegMatchTest.java>>
import java.io.File;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegMatchTest
{
    public static void main(String[] args) throws Exception
    {
        String pat[] = {"a*c?(d|f+)", "g.*c?(r|d+)"};
        String word;
        long num = 100000;

        // Read the whole test file into a single string.
        File f = new File("text.txt");
        char buff[] = new char[(int) f.length()];
        FileReader fr = new FileReader(f);
        fr.read(buff);
        fr.close();
        word = new String(buff);

        System.out.println("Testing for " + num + " loops.");
        long avgSum = 0;
        // 11 runs; the first run (n == 0) is left out of the average.
        for (int n = 0; n <= 10; ++n)
        {
            long t1 = System.currentTimeMillis();
            new RegMatchTest().Test1(pat, word, num);
            long t2 = System.currentTimeMillis();
            System.out.println("Elapsed time : " + (t2 - t1) + " ms");
            if (n > 0)
                avgSum += t2 - t1;
        }
        System.out.println("\nAverage time : " + (avgSum / 10) + " ms");
    }

    boolean Test1(String[] pat, String word, long len)
    {
        // Compile both patterns once per run, before the matching loop.
        Pattern p[] = {Pattern.compile(pat[0]),
                       Pattern.compile(pat[1])};

        boolean b = false;
        for (int n = 0; n < len; ++n)
        {
            // matches() succeeds only if the whole input matches the pattern.
            Matcher m = p[n % 2].matcher(word);
            b = m.matches();
        }
        return b;
    }
}
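
On the worry that the JIT might skip work whose result is never used: one way to rule that out is to make the result of every iteration observable, for example by counting the matches and printing the count. The following is only a sketch under assumptions. The class name RegMatchCheck, the method countMatches and the placeholder input string are made up for illustration and are not part of the original test, which reads text.txt.

```java
import java.util.regex.Pattern;

class RegMatchCheck {
    // Counts how many of the iterations produce a full match,
    // alternating between the given patterns as the original loop does.
    static long countMatches(Pattern[] patterns, String input, int iterations) {
        long hits = 0;
        for (int n = 0; n < iterations; n++) {
            if (patterns[n % patterns.length].matcher(input).matches()) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern[] patterns = {
            Pattern.compile("a*c?(d|f+)"),
            Pattern.compile("g.*c?(r|d+)")
        };
        String input = "aaacfff"; // placeholder input, not the original text.txt

        long start = System.nanoTime();
        long hits = countMatches(patterns, input, 100000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Printing the count means the loop's result is actually used.
        System.out.println(hits + " matches in " + elapsedMs + " ms");
    }
}
```

With this placeholder input only the first pattern matches, so half of the iterations are counted. If the timings stay in the same range as in the original test, dead-code elimination is unlikely to explain the difference.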


<<class1.cs>>
using System;
using System.Text.RegularExpressions;

class Class1
{
    [STAThread]
    static void Main(string[] args)
    {
        string[] pat = {@"a*c?(d|f+)", @"g.*c?(r|d+)"};
        string word;
        long num = 100000;

        // Read the whole test file into a single string.
        System.IO.StreamReader tr = new System.IO.StreamReader("text.txt");
        word = tr.ReadToEnd();
        tr.Close();

        Console.WriteLine("Testing for " + num + " loops.");
        long avgSum = 0;
        // 11 runs; the first run (n == 0) is left out of the average.
        for (int n = 0; n <= 10; ++n)
        {
            DateTime t1 = DateTime.Now;
            new Class1().Test1(pat, word, num);
            DateTime t2 = DateTime.Now;
            TimeSpan ts = t2 - t1;
            Console.WriteLine("Elapsed time : " + (ts.TotalMilliseconds) + " ms");
            if (n > 0)
                avgSum += (long)ts.TotalMilliseconds;
        }
        Console.WriteLine("\nAverage time : " + (avgSum / 10) + " ms");
    }

    bool Test1(string[] pat, String word, long len)
    {
        // Construct both compiled regexes once per run, before the matching loop.
        Regex[] p = {new Regex(pat[0], RegexOptions.Compiled),
                     new Regex(pat[1], RegexOptions.Compiled)};
        bool b = false;
        for (int n = 0; n < len; ++n)
        {
            // Match() searches for the first match anywhere in the input.
            Match m = p[n % 2].Match(word);
            b = m.Success;
        }
        return b;
    }
}
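
One thing to keep in mind when comparing the two listings: Java's Matcher.matches() succeeds only if the entire input matches the pattern, while .NET's Regex.Match looks for the first match anywhere in the input, which is closer to Java's Matcher.find(). The two loops may therefore not be doing the same amount of work. A small Java sketch of the difference follows. The class name MatchModes, its helper methods and the sample string are hypothetical and only serve as illustration.

```java
import java.util.regex.Pattern;

class MatchModes {
    // True only if the entire input matches the pattern.
    static boolean fullMatch(Pattern p, String input) {
        return p.matcher(input).matches();
    }

    // True if some substring of the input matches the pattern.
    static boolean searchMatch(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("a*c?(d|f+)");
        String input = "xx aacd yy"; // hypothetical sample text

        System.out.println("matches(): " + fullMatch(p, input));
        System.out.println("find():    " + searchMatch(p, input));
    }
}
```

For a like-for-like timing, both sides would need to use the same mode, for example find() in Java against Match in C#, or patterns anchored with ^ and $ on both sides.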
 

phoenix

I noticed the same in the past. Regex seems to be poorly supported by C#.
They are really slow even when compiled. I heard there are people porting the
Boost package to C#, but I haven't found it yet.

Yves
 
