Speed problems on List<T>.Insert(i, T)

Thread starter: Stuart

Hi everyone,

I have an app that needs to insert 400,000+ items into a sorted list,
and I am currently using a simple insert, something like this:

int index = list.BinarySearch(t);
if (index < 0)
    list.Insert(~index, t);

But this gets increasingly slow: it takes 1ms to insert the first 1,000
items, 20ms for the thousand between 19,000 and 20,000, and 400ms per
thousand once past 200,000. The slowdown is in the Insert; the
BinarySearch is still lightning quick.

Does anyone know a better way to do this, using List or otherwise? I
expected some slowdown, but this is quite harsh.

Thanks

Stuart
 
Stuart said:
I have this app that needs to insert about 400,000+ items into a sorted
list [...] The slowdown is on the insert, the binarysearch is still
lightning quick. Does anyone know a better way to do this? I expected
slowdown but this is quite harsh.

Keep a dictionary to determine whether each value is already there, add
new values to the end of the list, and then sort the list once
afterwards. That'll be much faster than inserting into the sorted
position each time.
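A minimal sketch of that approach, assuming int elements (substitute your own type):

```csharp
using System;
using System.Collections.Generic;

class DedupeThenSort
{
    // Collect unique items in a dictionary-as-set, then sort once at
    // the end, instead of paying an O(n) shift on every Insert.
    public static int[] BuildSorted(IEnumerable<int> items)
    {
        Dictionary<int, object> seen = new Dictionary<int, object>();
        foreach (int item in items)
            seen[item] = null;          // duplicate keys just overwrite

        int[] result = new int[seen.Count];
        seen.Keys.CopyTo(result, 0);
        Array.Sort(result);             // one O(n log n) sort
        return result;
    }

    static void Main()
    {
        int[] sorted = BuildSorted(new int[] { 5, 3, 5, 1, 3 });
        foreach (int x in sorted)
            Console.Write(x + " ");     // prints 1 3 5
    }
}
```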
 
Stuart said:
Hi everyone,

Hello there.
Does anyone know a better way to do this, using List or otherwise? I
expected slowdown but this is quite harsh.

List<T> is implemented as a resizing array, so inserting an element
requires copying all of the following elements. That means your
insertion loop is effectively O(n^2).

Jon Skeet's suggestion is to first filter out duplicates using a data
structure that decides membership very quickly, then copy the unique
elements into a list and sort that.

Ideally, .NET would have an ICollection implementation with set
properties, but unfortunately that's not available; you can get by
using a hash table in the form of a Dictionary<K,V>.

For the Dictionary to work correctly, GetHashCode must be properly
implemented for the items. That means either object identity should
decide equality, or you should override Equals and GetHashCode. Yet
another option is to implement IEqualityComparer<T> and parameterise
the dictionary with it.

Here is a simple example (note that the values in "seen" are useless in
this example, but are required by the use-a-dictionary-as-a-set trick):

IDictionary<Item, object> seen = new Dictionary<Item, object>();
foreach (Item item in items)
    seen[item] = null;
// now, seen contains exactly the unique items
Item[] item_array = new Item[seen.Count];
seen.Keys.CopyTo(item_array, 0);
Array.Sort(item_array);
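If you take the IEqualityComparer route instead, it looks roughly like this. The Item class and its Id field here are invented purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// A made-up item type, purely for illustration.
class Item
{
    public int Id;
    public Item(int id) { Id = id; }
}

// Supplies equality and hashing externally, so Item itself need not
// override Equals/GetHashCode.
class ItemByIdComparer : IEqualityComparer<Item>
{
    public bool Equals(Item a, Item b) { return a.Id == b.Id; }
    public int GetHashCode(Item a) { return a.Id; }
}

class Demo
{
    static void Main()
    {
        // Pass the comparer to the constructor; the dictionary then
        // treats two Items with the same Id as the same key.
        Dictionary<Item, object> seen =
            new Dictionary<Item, object>(new ItemByIdComparer());
        seen[new Item(1)] = null;
        seen[new Item(1)] = null;   // same Id, so no second entry
        seen[new Item(2)] = null;
        Console.WriteLine(seen.Count);  // prints 2
    }
}
```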
 
Thanks a lot guys, works like a charm.

Stuart

 
You could try first inserting all the items into the list and doing a
single quick sort at the end. I don't know if this will be quicker, but
it's worth a try. Please do a performance test and tell us how it goes.
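A sketch of that, again with int elements; List<T>.Sort does an in-place O(n log n) sort:

```csharp
using System;
using System.Collections.Generic;

class AddThenSort
{
    static void Main()
    {
        List<int> list = new List<int>();

        // Append everything first; Add is amortised O(1).
        foreach (int item in new int[] { 42, 7, 19, 7, 3 })
            list.Add(item);

        // One sort at the end instead of shifting on every insert.
        list.Sort();

        foreach (int x in list)
            Console.Write(x + " ");  // prints 3 7 7 19 42
    }
}
```

Note this keeps duplicates; combine it with the dictionary trick from earlier in the thread if you also need uniqueness.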


Inserting items into a binary tree will also get slower as the tree
grows (O(log n) per insert in a balanced tree); that's the nature of
binary trees.


Regards,
Lars-Inge Tønnessen
 
