Fast String Processing in .NET

  • Thread starter: rawCoder

rawCoder

Hi,

Consider a server that receives delimiter-separated string data from a client.

It needs to process this data very fast.

Now if it were built in C++, one obvious idea would be to put the string on the
heap as a char* and then pass the pointer around; the delimiter-separated fields
would be accessed via pointers as well (a <map> of pointers to fields). This
would mean no extra copying, and so no significant performance or memory cost.
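
To make the C++ idea above concrete, here is a minimal sketch of referencing fields in place. It makes some assumptions that are not in the original post: C++17's std::string_view stands in for raw char* pointers, a std::vector stands in for the <map>, and the name split_fields and the '|' delimiter are purely illustrative.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Split `record` on `delim` without copying any characters.
// Each returned string_view is only a (pointer, length) pair into the
// caller's buffer, so it stays valid only while that buffer is alive
// and unmodified.
// NOTE: split_fields is a hypothetical helper written for illustration.
std::vector<std::string_view> split_fields(std::string_view record, char delim) {
    std::vector<std::string_view> fields;
    std::size_t start = 0;
    while (true) {
        std::size_t end = record.find(delim, start);
        if (end == std::string_view::npos) {
            // Last (or only) field: everything from `start` to the end.
            fields.push_back(record.substr(start));
            break;
        }
        fields.push_back(record.substr(start, end - start));
        start = end + 1;  // skip past the delimiter
    }
    return fields;
}
```

Calling split_fields on a buffer holding "alice|42|london" with '|' as the delimiter yields three views, each pointing into the original buffer rather than at a copy. The question is how to get the same effect in .NET.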

So how can this be done in the most performance-friendly way in .NET?

Please let me know if more information is required on the problem (this is
based on a real-life problem).

Thank you in advance
rawCoder
 
StringBuilder doesn't exactly provide what I am looking for.
It does provide super fast concatenation and mutability, but what I really
need is something like referencing substrings in place.

rawCoder
 
AFAIK you could also access this as an array of chars, as a stream, or even
using pointers...

You'll likely have to do some testing to select the appropriate method.

Patrice
 
rawCoder said:
Consider a server which receives delimiter based string data from
client.

It needs to process this data very fast.

I always find such statements to be suspicious. This is the same as saying
"as fast as possible", which, given an infinite amount of development time,
could be quite fast. But no project will support an infinite amount of
development time.

If this is your requirement, change it to something concrete, like "must
process strings from sample data set X with an average time of 12 ms and a
maximum time of 50 ms" or something similar. If this is someone else's
requirement, get them to clarify it in a similar manner.

This allows you to determine when this requirement has been met and when you
can move on to other requirements such as scalability and robustness.

rawCoder said:
Now if its built in C++, one obvious idea would be to put the string
on heap as char* and then pass the pointer around, delimiter
separated fields will be accessed via pointers as well ( <map> of
pointers to fields ). this will mean that there is no extensive
performance nor memory consumption.

So how this can be done in the most performance friendly way in .NET

Well, for one thing, "in .NET" would include Managed C++ where you could do
this precisely as you've stated.

Secondly, presuming you want to use one of the libraries, I would start with
the Regex library. If you can define your text processing in terms of that,
you would likely find very good performance, given a precompiled regular
expression.

http://msdn.microsoft.com/library/d...stemtextregularexpressionsregexclasstopic.asp

(tiny version:)

http://tinyurl.com/9s84v


Thirdly, if that still isn't good enough, you may want to step back into
(Managed) C++ and look into using the Spirit parser.

HTH

--
Reginald Blue
"I have always wished that my computer would be as easy to use as my
telephone. My wish has come true. I no longer know how to use my
telephone."
- Bjarne Stroustrup (originator of C++) [quoted at the 2003
International Conference on Intelligent User Interfaces]
 
