Use of C-dll, an optimization?

  • Thread starter: Vitling

Vitling

I've got a C# program that uses the Levenshtein distance algorithm
(C source at http://www.merriampark.com/ldc.htm). This function is
called _very_ frequently and uses up lots of CPU time, so I'm
interested in any optimizations that would speed things up a bit.

Would there be anything to gain by using the algorithm as a compiled
C DLL in my C# program? I figure there would be some additional
overhead in the DLL handling, and I'd have to add a Unicode->ANSI
conversion (Marshal.StringToHGlobalAnsi()) on both of the argument
strings.

Any ideas are very much appreciated.

Thanks!
 
Hi Vitling:

Since you are frequently calling the function I suspect the interop
overhead might actually slow you down - but measuring your application
in your environment is the only way to tell for sure.
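
A rough way to take such a measurement is sketched below. It is only an illustration, not code from the thread: the class name DistanceBenchmark and the method Measure are made up for this example, and it assumes both candidates (the managed version and a P/Invoke wrapper) can be passed in as a Func<string, string, int>.

```csharp
using System;
using System.Diagnostics;

public static class DistanceBenchmark
{
    // Hypothetical helper: times `iterations` calls of `distance` on one
    // pair of strings and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Func<string, string, int> distance,
                                   string s, string t, int iterations)
    {
        // Warm-up call, so that one-time costs (JIT compilation, or the
        // first-call DLL load for a P/Invoke wrapper) are not timed.
        int sink = distance(s, t);

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += distance(s, t);
        }
        watch.Stop();

        // Use the accumulated result so the calls are not treated as dead code.
        GC.KeepAlive(sink);
        return watch.Elapsed;
    }
}
```

Calling Measure once with the managed implementation and once with the interop wrapper, on strings typical of the real workload and in a release build, gives two figures that can be compared directly.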

Perhaps you could post the C# code in a short but complete program and
we could make some suggestions.
 
Daniel Jin said:
P/Invoke itself certainly has overhead, and in addition, I've read that
Unicode->ANSI conversion is very expensive. Any performance you might gain
from going unmanaged will probably be negated.
I took a look at the C code; it seems simple enough that I don't see why
managed code can't do a decent job. The only additional cost you will incur
is the bounds checks on array access, some of which might even be JITted
away. You can avoid all bounds checks by using unsafe code: stackalloc the
int array and access the strings through a char*.
As Scott said, the only way to know the performance is to test. Try both
unsafe and P/Invoke and see which is faster.

Definitely. However, I'm going to guess that the C# implementation would be
quicker. Here's why:

n=strlen(s);
This is O(n) in C, O(1) for Framework strings.

d=malloc((sizeof(int))*(m+1)*(n+1));
malloc is significantly slower than a Framework allocation. Of course,
there's a price to be paid in garbage collection, but that will be done
later and anyway a gen.0 collect is very fast.

int minimum(int a,int b,int c)
As it stands this function probably won't be inlined in C. I'd expect it to
be in C# though.

Everything else
Will be JIT-compiled in C# so I wouldn't have thought there'd be a
noticeable difference.

On top of that, you'll have a DLL/interop overhead if you go via the C
route. The string Unicode <-> ANSI issue could be overcome by having the C
function take wide-character strings.
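
As a sketch of the wide-character idea, the declaration below shows what the C# side might look like. The DLL name "ldc.dll" and the export name levenshtein_distance are placeholders invented for this example; it also assumes a Windows build in which the C function has been changed to take const wchar_t* arguments (16-bit characters) and uses the cdecl calling convention.

```csharp
using System.Runtime.InteropServices;

internal static class NativeMethods
{
    // Hypothetical native export, assumed to be declared in C as:
    //   __declspec(dllexport) int levenshtein_distance(const wchar_t *s,
    //                                                  const wchar_t *t);
    // "ldc.dll" and the function name are placeholders, not part of the
    // original C source.
    [DllImport("ldc.dll",
               CharSet = CharSet.Unicode,                 // marshal strings as wide chars
               CallingConvention = CallingConvention.Cdecl,
               ExactSpelling = true)]
    internal static extern int levenshtein_distance(string s, string t);
}
```

With CharSet.Unicode the explicit Marshal.StringToHGlobalAnsi() calls are no longer needed, although the per-call interop overhead still applies.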

However I'm willing to be proved wrong, and as others have said, only
measurement can tell you the actual answer.

(If you do write it in C#, don't hoist the string length checks either -- I
believe the JIT can optimise away bounds checking if you leave them where
they ought to be, in the for statement).

Stu
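
For reference, the points above can be put together in a minimal managed version along the following lines. This is an illustrative sketch, not the original poster's code: the class and method names are made up, it assumes non-null arguments, and it keeps the full table used by the referenced C source rather than a more memory-efficient variant.

```csharp
using System;

public static class Levenshtein
{
    // Edit distance between s and t, using a full
    // (s.Length + 1) x (t.Length + 1) table like the referenced C version.
    public static int Distance(string s, string t)
    {
        // A managed allocation replaces malloc; the string lengths are
        // stored with the strings, so no strlen-style scan is needed.
        int[,] d = new int[s.Length + 1, t.Length + 1];

        for (int i = 0; i <= s.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= t.Length; j++) d[0, j] = j;

        // The loop bounds are written directly against s.Length and
        // t.Length, as suggested in the thread. Whether the JIT actually
        // removes bounds checks here has not been verified.
        for (int i = 0; i < s.Length; i++)
        {
            for (int j = 0; j < t.Length; j++)
            {
                int cost = (s[i] == t[j]) ? 0 : 1;
                d[i + 1, j + 1] = Minimum(d[i, j + 1] + 1,   // deletion
                                          d[i + 1, j] + 1,   // insertion
                                          d[i, j] + cost);   // substitution
            }
        }
        return d[s.Length, t.Length];
    }

    // Small helper; short methods like this are candidates for JIT inlining.
    private static int Minimum(int a, int b, int c)
    {
        int min = a < b ? a : b;
        return min < c ? min : c;
    }
}
```

For example, Levenshtein.Distance("GUMBO", "GAMBOL") returns 2 (one substitution and one insertion).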
 
I also had occasion to use a C++ DLL from a C# DLL, and ran into too
many problems when going through P/Invoke.
I only got rid of them after making my C# DLL a COM+ component.
 
Thanks for all the help. I think I'm going to stick with my C# version,
since any performance gain seems to be minor (at best).
 