How to process millions of strings?

  • Thread starter: Bernie Walker

Bernie Walker

I have been using ArrayLists to manipulate text data. I get decent
results for what I want to do until I exceed somewhere around 500,000
items in the ArrayList. The text data is tabular in nature, and I have been
sorting and combining it. The memory required to process it becomes
an issue with larger data sets. I am looking for suggestions on how to
handle millions of rows of data in a way that is relatively quick and memory
efficient. I am relatively new to C#, so simpler is better.

Thanks for your consideration,

Bernie.
 
Take a look at DataSets/DataTables/DataViews. Although they might be a
bit heavier, they have a lot of the behaviors you want built in (sorting,
etc.). You can also run aggregate functions on them.
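
For illustration only, here is a rough sketch of what that might look like. The column names ("Name", "Amount"), the sample rows, and the class and method names are made up for the example; they are not from Bernie's data.

```csharp
using System;
using System.Data;

// Hypothetical example: table layout and names are assumptions for illustration.
static class DataTableSketch
{
    // Build a small in-memory table with two typed columns.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable("Rows");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Amount", typeof(int));

        table.Rows.Add("pear", 3);
        table.Rows.Add("apple", 5);
        table.Rows.Add("apple", 2);
        return table;
    }

    // A DataView sorts without copying or reordering the underlying rows.
    public static DataView SortedByNameThenAmount(DataTable table)
    {
        DataView view = new DataView(table);
        view.Sort = "Name ASC, Amount DESC";
        return view;
    }

    static void Main()
    {
        DataTable table = BuildTable();

        foreach (DataRowView row in SortedByNameThenAmount(table))
            Console.WriteLine("{0}\t{1}", row["Name"], row["Amount"]);

        // Aggregate function with a filter expression.
        object total = table.Compute("SUM(Amount)", "Name = 'apple'");
        Console.WriteLine("apple total: {0}", total);
    }
}
```

Whether this holds up at millions of rows depends on the data. A DataTable carries extra per-row overhead, so it is worth measuring memory on a realistic sample first.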

Cheers,

Greg
 
Bernie,

Take a look at ternary search trees. A ternary search tree (TST) is a data structure that is specifically designed for searching and sorting text data. Since I don't know the precise specifications of your particular problem, I can't guarantee that TSTs will work for you, but they might be worth researching. Of course, the framework doesn't contain an implementation of a TST, so you'd have to write one yourself.
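
To give an idea of the shape of the thing, a minimal sketch follows. It is not a tuned implementation; the class and member names are invented for the example. It stores each distinct string once (duplicates collapse), ignores empty strings, and compares characters by their ordinal value, which may not match the culture-aware ordering you get from sorting an ArrayList of strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical minimal ternary search tree; all names are illustrative.
class TernarySearchTree
{
    private class Node
    {
        public char Ch;
        public bool IsEnd;        // true if a stored string ends at this node
        public Node Lo, Eq, Hi;   // smaller char, next char, larger char
        public Node(char ch) { Ch = ch; }
    }

    private Node root;

    public void Add(string s)
    {
        if (string.IsNullOrEmpty(s)) return;  // empty strings are not stored in this sketch
        root = Add(root, s, 0);
    }

    private static Node Add(Node n, string s, int i)
    {
        char c = s[i];
        if (n == null) n = new Node(c);
        if (c < n.Ch) n.Lo = Add(n.Lo, s, i);
        else if (c > n.Ch) n.Hi = Add(n.Hi, s, i);
        else if (i < s.Length - 1) n.Eq = Add(n.Eq, s, i + 1);
        else n.IsEnd = true;
        return n;
    }

    public bool Contains(string s)
    {
        if (string.IsNullOrEmpty(s)) return false;
        Node n = root;
        int i = 0;
        while (n != null)
        {
            char c = s[i];
            if (c < n.Ch) n = n.Lo;
            else if (c > n.Ch) n = n.Hi;
            else if (i == s.Length - 1) return n.IsEnd;
            else { n = n.Eq; i++; }
        }
        return false;
    }

    // In-order traversal returns the stored strings in ordinal order.
    public List<string> ToSortedList()
    {
        List<string> result = new List<string>();
        Collect(root, new StringBuilder(), result);
        return result;
    }

    private static void Collect(Node n, StringBuilder prefix, List<string> result)
    {
        if (n == null) return;
        Collect(n.Lo, prefix, result);
        prefix.Append(n.Ch);
        if (n.IsEnd) result.Add(prefix.ToString());
        Collect(n.Eq, prefix, result);
        prefix.Length--;          // drop this node's char before visiting Hi
        Collect(n.Hi, prefix, result);
    }
}

static class TstDemo
{
    static void Main()
    {
        TernarySearchTree tree = new TernarySearchTree();
        foreach (string s in new string[] { "pear", "apple", "apply", "app" })
            tree.Add(s);

        Console.WriteLine(tree.Contains("apple"));  // stored
        Console.WriteLine(tree.Contains("appl"));   // only a prefix, not stored
        foreach (string s in tree.ToSortedList())
            Console.WriteLine(s);
    }
}
```

Strings that share a prefix share nodes, but every node is a separate object, so whether this uses less memory than a flat list depends on how much the rows overlap. If duplicate rows matter, each end node would need a count rather than a flag.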

Brian
 