best way to enumerate List<> & remove unwanted elements?

Z

Zytan

Obviously you can't just use a simple for loop, since you may skip
over elements.

You could modify the loop counter each time an element is deleted.
But, the loop ending condition must be checked on each iteration,
since the Count changes as you delete elements. I would think it is
guaranteed to be computed each time, and not cached.

So, is this the best way?

List<int> mylist = .......something......;
for (int i=0; i<mylist.Count; i++) {
if (want_to_remove) {
mylist.RemoveAt(i);
i--;
}
}

Zytan
 
N

Nicholas Paldino [.NET/C# MVP]

No, it isn't. To make it easier, you really should enumerate from the
end of the list. This way, you don't have to play around with the index
variable (in this case, i).
 
Z

Zytan

No, it isn't. To make it easier, you really should enumerate from the
end of the list. This way, you don't have to play around with the index
variable (in this case, i).

Ok. Enumerate with a for loop? And the automatic i-- each time is
sufficient for not skipping elements, I see. Thanks.

(I believe my solution still works, though, right? It's just not the
best way.)

Zytan
 
M

Mr. Arnold

Zytan said:
Obviously you can't just use a simple for loop, since you may skip
over elements.

You could modify the loop counter each time an element is deleted.
But, the loop ending condition must be checked on each iteration,
since the Count changes as you delete elements. I would think it is
guaranteed to be computed each time, and not cached.

So, is this the best way?

List<int> mylist = .......something......;
for (int i=0; i<mylist.Count; i++) {
if (want_to_remove) {
mylist.RemoveAt(i);
i--;
}
}

No, you don't decrement (i) like that, you're going to eventually blow up at
that RemoveAt(i).

You just want to do mylist.Clear if you just want to *clear* all elements
out of the list, if that's what you're trying to do.
 
M

Marc Gravell

Another option is to use RemoveAll(...) with a predicate, perhaps
inline.

Marc
 
P

Peter Duniho

Obviously you can't just use a simple for loop, since you may skip
over elements.

You could modify the loop counter each time an element is deleted.
But, the loop ending condition must be checked on each iteration,
since the Count changes as you delete elements. I would think it is
guaranteed to be computed each time, and not cached.

So, is this the best way?

I like Marc's suggestion, to use a predicate delegate to control removal..
It seems tailor-made to the exact scenario you're asking about.

Note that if efficiency is of concern, you may prefer to actually generate
a whole new List<> instance instead, copying over only the values you want
to the new List<>. The reason being that, if I recall correctly, the
List<> implementation uses an array, and insertions and removals in the
array require shifting the contents of the array. In other words, even if
you start at the end, removing elements one at a time involves copying on
average half of the array for each removal.

If you're removing a large percentage of the elements, and you start at
the end of the list, this will help by ensuring that you're shifting the
minimal number of elements with each removal. But you still have the
shift. If you're willing to create a new copy of the List<>, then you can
preallocate the List<> to be as large as is necessary to ensure no
reallocations during the processing, and then if you're concerned about
wasted memory once you're done, trim the List<>.

The docs *claim* that RemoveAll() is O(n). So it's possible that
internally, it does exactly what I describe above. It's hard to see how,
if the List<> implementation really is an array, it could be O(n)
otherwise. But there's a bunch of assumptions in the first part of this
paragraph, so if you really care it seems to me you should probably do
some direct tests between the various methods (and in particular, doing
your own copy-based algorithm vs using RemoveAll()).

If it's not that important (and frankly, it probably isn't until you have
proven to yourself that this part of the code is important for performance
in your application overall), then you should probably just use
RemoveAll() and trust the docs. :)

Finally, note that at the very least I don't see any point in incrementing
a counter that you've just decremented. While the algorithm you posted
could be greatly improved as already mentioned, at a minimum it seems to
me it should look more like this:

List<int> mylist = /* whatever */;
int i = 0;

while (i < mylist.Count)
{
if (want_to_remove)
{
mylist.RemoveAt(i);
}
else
{
i++;
}
}

Pete
 
Z

Zytan

No, you don't decrement (i) like that, you're going to eventually blow up at
that RemoveAt(i).

Why? I think it works. Remember the loop increments i itself, so
afte a remove, and decrement, and increment, i is the same value. The
loop code should jump out when there are no elements left.
You just want to do mylist.Clear if you just want to *clear* all elements
out of the list, if that's what you're trying to do.

No, that's not what I am trying to do. I thought my code was clear.
I want to enumerate all elements and remove some based on some
criteria. Perhaps all will be removed, perhaps none, likely only
some.

Zytan
 
Z

Zytan

Another option is to use RemoveAll(...) with a predicate, perhaps

Interesting, I didn't even know that existed. Thanks, Marc. I think
this is the best method, although it moves the criterion code into
another function (which may be desired if it was complex).

Zytan
 
Z

Zytan

I like Marc's suggestion, to use a predicate delegate to control removal.
It seems tailor-made to the exact scenario you're asking about.

Yes, it does.
Note that if efficiency is of concern
[snip great explanation]

I totally agree. But, efficiency is not a concern, and the list is
small (10's of elements). And yes, even if it was bigger, I shouldn't
care about it until I know it's a bottleneck.

I am also interesting in List<>'s internals. I don't think the docs
lie. Perhaps a look into the C++ STL implementation would reveal some
information about how it can act like an array, and be fast for
modification at the same time. I don't have the time ATM to look this
up.
Finally, note that at the very least I don't see any point in incrementing
a counter that you've just decremented.

Agreed. I like your code better. Don't know why I didn't think about
that.
While the algorithm you posted
could be greatly improved as already mentioned, at a minimum it seems to
me it should look more like this:

Pete, does my algorithm, however bad it is, work? I am 99% sure it
does, but people doubt it.

Zytan
 
P

Peter Duniho

[...]
Pete, does my algorithm, however bad it is, work? I am 99% sure it
does, but people doubt it.

I only saw one post doubting it, and I suspect he simply misread the
code. I've been known to do the same from time to time. :)
 
Z

Zytan

I only saw one post doubting it, and I suspect he simply misread the
code. I've been known to do the same from time to time. :)

Ok, thanks :)

I've updated all my code with the delegate, anyway. Much better!

Zytan
 
M

Marc Gravell

although it moves the criterion code into
another function (which may be desired if it was complex).

If complex, then yes - perhaps move into another function; however,
note that you can also
do this inline (as mentioned), which can actually make it easier to
use arguments, since
they will be "captured" into the inline method.

For instance:

List<int> values = ...
int keepThoseDivisibleBy = 4; // could be a parameter
values.RemoveAll(delegate (int value) {
// return true to remove, i.e. those *not* divisible
return value % keepThoseDivisibleBy != 0;
});

A dubious example, but useful for lots of scenarios,
e.g. string matching (eitherusing index-of, or perhaps compile
a Regex and test each in turn).

This should presumably get even easier when we have lambdas...
I'm hoping that I can do something like (forgive the syntax; my C#3.0
isn't too hot yet):

values.RemoveAll(value => value % keepThoseDivisibleBy != 0);

Marc
 
M

Mr. Arnold

Zytan said:
Why? I think it works. Remember the loop increments i itself, so
afte a remove, and decrement, and increment, i is the same value. The
loop code should jump out when there are no elements left.

It's been my experience with addressing an element in a loop with a
RemoveAt(i) is trouble, because the number of (i) may not be there when you
think it is suppose to be there.

I would be doing it from another function and have (i) predetermined knowing
for sure that (i) is a valid element index and would be calling a function
to do the RemoveAt(i).

You should be passing and index to RemoveAt(i) for a known predetermined
element index and not off of some in-line loop on the fly. But that's just
me.
 
P

Peter Duniho

[...]
You should be passing and index to RemoveAt(i) for a known predetermined
element index and not off of some in-line loop on the fly. But that's
just me.

While I'm not defending the overall style of that loop, what makes you
suggest that the index he's removing is NOT "a known predetermined element
index"? At the point in that code, the index "i" is guaranteed to be no
less than 0 but less than the total number of elements in the array. In
other words, within the legal values for array indexes for his list.
"mylist.Count" is reevaluated each time through the loop, and so the loop
will never be entered unless it is possible to successfully remove the
element at position "i".

Pete
 
Z

Zytan

While I'm not defending the overall style of that loop, what makes you
suggest that the index he's removing is NOT "a known predetermined element
index"? At the point in that code, the index "i" is guaranteed to be no
less than 0 but less than the total number of elements in the array. In
other words, within the legal values for array indexes for his list.
"mylist.Count" is reevaluated each time through the loop, and so the loop
will never be entered unless it is possible to successfully remove the
element at position "i".

Yes, exactly. Only if I were multi-threading would the code
potentially be fatal. And in that case, I could use a lock around the
whole thing to ensure the mylist.Count is still valid by the time I go
and use "i".

Zytan.
 
Z

Zytan

For instance:
List<int> values = ...
int keepThoseDivisibleBy = 4; // could be a parameter
values.RemoveAll(delegate (int value) {
// return true to remove, i.e. those *not* divisible
return value % keepThoseDivisibleBy != 0;
});

Oh, that's what you meant by inline! yes, I have seen that before,
that's cool. thanks for the idea. My code is complex enough that it
warrants another function, so it was fine.

Zytan
 
P

Peter Duniho

Yes, exactly. Only if I were multi-threading would the code
potentially be fatal. And in that case, I could use a lock around the
whole thing to ensure the mylist.Count is still valid by the time I go
and use "i".

And if you were multi-threading, moving the call to RemoveAll() into a
separate method wouldn't change things. You'd still need to lock the list
before messing with it.
 
Z

Zytan

And if you were multi-threading, moving the call to RemoveAll() into a
separate method wouldn't change things. You'd still need to lock the list
before messing with it.

Yes, that's right. What is does is not an atomic operation, and it
couldn't possibly hope to lock the data from anyone else who may try
to access it. That's up to you.

Zytan
 
M

Mr. Arnold

Peter Duniho said:
[...]
You should be passing and index to RemoveAt(i) for a known predetermined
element index and not off of some in-line loop on the fly. But that's
just me.

While I'm not defending the overall style of that loop, what makes you
suggest that the index he's removing is NOT "a known predetermined element
index"?

I don't like the loop in this case, because I have been in that loop doing
RemoveAt(i) and watched it go down.
 
P

Peter Duniho

I don't like the loop in this case, because I have been in that loop
doing RemoveAt(i) and watched it go down.

Well, you must have had a different bug then.

It's true that there are other reasons not to loop that way, and one of
those reasons is the likelihood of creating new bugs that cause problems.
But the code posted doesn't have any problems with correctness.

Pete
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top