I need a sorted HashSet

not_a_commie

The new SortedSet<T> class in .NET 4 rejects inserts when
IComparable.CompareTo(..) == 0. That's lame. The two Ordered
collections in PowerCollections behave the same way. None of those
three are actually a HashSet. Is anyone aware of an ordered HashSet
implementation?
 
Rick Lones

not_a_commie said:
The new SortedSet<T> class in .NET 4 rejects inserts when
IComparable.CompareTo(..) == 0. That's lame. The two Ordered
collections in PowerCollections behave the same way. None of those
three are actually a HashSet. Is anyone aware of an ordered HashSet
implementation?

I'm unaware that such a thing even could exist, at least without major
hoop-jumping under the covers. The underlying structure (buckets) of a hash
table (and presumably of a hash set) is optimized for random access but is not
ordered by key. If you want both random AND key-ordered sequential access you
need a SortedList, SortedSet, Dictionary, etc. Their underlying structure (best
case a tree) permits sequential access as well as random. A HashSet is
IEnumerable but I would think that you could not expect a key-sorted order to
result if it is truly based on a hashing algorithm.

It's unclear how any of that relates to your IComparable complaint, though. If
you mean that duplicate keys are not allowed, that is also true for hash tables
and dictionaries of pretty much all kinds, AFAIK. You can always force a unique
compound key by appending, e.g., a timestamp or sequence number, to your
"natural" key. Of course you would have to re-implement IComparable so as to
account for that.
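
A minimal sketch of that compound-key idea, assuming a made-up Job type
that sorts by a Priority field. The names Job, Priority, Sequence and
JobDemo are hypothetical and only there for illustration; the point is
that CompareTo falls back to a sequence number so it never returns 0 for
two distinct instances.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical item type: ordered by Priority, with a sequence number as tie-breaker.
public sealed class Job : IComparable<Job>
{
    private static long _nextSequence;   // shared counter (an assumption for this sketch)

    public int Priority { get; private set; }
    public string Name { get; private set; }
    public long Sequence { get; private set; }

    public Job(int priority, string name)
    {
        Priority = priority;
        Name = name;
        Sequence = Interlocked.Increment(ref _nextSequence);
    }

    public int CompareTo(Job other)
    {
        if (other == null) return 1;
        int byPriority = Priority.CompareTo(other.Priority);
        // Equal "natural" keys: the sequence number decides, so distinct jobs never compare as 0.
        return byPriority != 0 ? byPriority : Sequence.CompareTo(other.Sequence);
    }
}

public static class JobDemo
{
    // Adds two jobs with the same priority and reports how many the set kept.
    public static int CountAfterAddingTies()
    {
        var set = new SortedSet<Job>();
        set.Add(new Job(1, "first"));
        set.Add(new Job(1, "second"));
        return set.Count;   // 2: the sequence number breaks the tie
    }
}
```

Ties then come out in insertion order, which may or may not be what you
want.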

HTH,
-rick-
 
Marcel Müller

Rick said:
I'm unaware that such a thing even could exist,

I am pretty sure that it can't exist.

Every definition of 'sorted' implies the definition of a sequence rather
than a set. The latter has no particular order.

A hash table, on the other hand, does not require a strict weak ordering
of its elements. It only compares for equality. And you will never get
any sort order by equality comparison.

Rick said:
but is not ordered by key. If you want both random AND key-ordered
sequential access you need a SortedList, SortedSet, Dictionary, etc.

No, Dictionary is also a hash table with no ordering information.
I think you mean SortedDictionary which is a binary tree implementation.

You could use a combination of SortedDictionary and Dictionary if you
need both a sorted sequence and O(1) read access to the elements. But
on write you will not end up with anything below O(log n) because you
have to keep the two collections in sync. Any database developer would
kiss your feet if you found a (scalable) algorithm that allows amortized
O(1) write access to an ordered sequence. This would require a defined
insert position with a constant number of comparisons.
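
A rough sketch of such a combination, under the assumption that the
key's Equals/GetHashCode and its ordering agree on which keys are equal.
OrderedLookup is a hypothetical name, not a framework class.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: Dictionary for O(1) reads, SortedDictionary for ordered enumeration.
public sealed class OrderedLookup<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _byHash = new Dictionary<TKey, TValue>();
    private readonly SortedDictionary<TKey, TValue> _byOrder;

    public OrderedLookup(IComparer<TKey> comparer = null)
    {
        _byOrder = new SortedDictionary<TKey, TValue>(comparer ?? Comparer<TKey>.Default);
    }

    public int Count { get { return _byHash.Count; } }

    // O(log n): the tree insert dominates the cost of a write.
    public bool Add(TKey key, TValue value)
    {
        if (_byHash.ContainsKey(key)) return false;
        // Tree first: if the comparer treats the key as a duplicate this throws
        // before the hash side is touched, so the two stay in sync.
        _byOrder.Add(key, value);
        _byHash.Add(key, value);
        return true;
    }

    // O(log n) as well, for the same reason.
    public bool Remove(TKey key)
    {
        if (!_byHash.Remove(key)) return false;
        _byOrder.Remove(key);
        return true;
    }

    // O(1) on average: served by the hash table alone.
    public bool TryGetValue(TKey key, out TValue value)
    {
        return _byHash.TryGetValue(key, out value);
    }

    // Key-ordered enumeration: served by the tree alone.
    public IEnumerable<KeyValuePair<TKey, TValue>> InOrder
    {
        get { return _byOrder; }
    }
}
```

Every value is stored twice, so this trades memory for the O(1) reads.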


Marcel
 
Registered User

not_a_commie said:
The new SortedSet<T> class in .NET 4 rejects inserts when
IComparable.CompareTo(..) == 0. That's lame.

The behavior is documented.

not_a_commie said:
The two Ordered
collections in PowerCollections behave the same way. None of those
three are actually a HashSet. Is anyone aware of an ordered HashSet
implementation?

On the web there are various examples of ordering hashtables using
extension methods. An alternative is to roll your own type. Again,
there are examples online.
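
One way the extension-method approach might look, as a sketch only:
ToSortedList is a hypothetical helper built on LINQ's OrderBy, and it
returns a sorted snapshot rather than keeping the set itself ordered.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HashSetOrderingExtensions
{
    // Hypothetical helper: copies the set into a list sorted by the given comparer.
    // Costs O(n log n) on every call; the HashSet itself stays unordered.
    public static List<T> ToSortedList<T>(this HashSet<T> set, IComparer<T> comparer = null)
    {
        return set.OrderBy(item => item, comparer ?? Comparer<T>.Default).ToList();
    }
}
```

Because the set decides membership by Equals/GetHashCode, items that
merely compare as equal are all kept and simply end up next to each
other in the result.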

The real issue is identical items cannot be sorted by comparison; any
ordering of identical items is purely arbitrary.

regards
A.G.
 
not_a_commie

Attribution lost:
Is anyone aware of an ordered HashSet implementation?

Yes.  SortedSet<T>.

It simply makes no sense for the same element to be present more than
once in a set.

Really? You're going to limit the definition of set to say that it
can't have any duplicates? (And I suppose I'd be okay with that
definition in this situation if SortedSet<T> actually used the hash
code to make that determination.)

Attribution lost:
If you refuse to provide a way to distinguish between two elements that
have the same basic key you're using for the set...

But I did give it a way to distinguish between elements -- the hash
code is different. It's just the comparison that returns an equal
value between the occasional two elements. However, SortedSet<T> cares
more about the comparison than it does the hash code.
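
For illustration, here is roughly what a hash-code tie-break could look
like as a custom IComparer<T> handed to SortedSet<T>. This is a sketch:
HashTieBreakComparer and TieBreakDemo are hypothetical names, not
framework types, and Comparer<T>.Create needs .NET 4.5 or later. It also
has a hole: two distinct items whose hash codes happen to collide still
compare as 0, and the second one is still rejected.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: primary comparison first, hash code as tie-breaker.
public sealed class HashTieBreakComparer<T> : IComparer<T>
{
    private readonly Comparison<T> _primary;

    public HashTieBreakComparer(Comparison<T> primary)
    {
        if (primary == null) throw new ArgumentNullException("primary");
        _primary = primary;
    }

    public int Compare(T x, T y)
    {
        int result = _primary(x, y);
        if (result != 0) return result;
        // Tie on the primary key: order by hash code instead.
        // Items whose hash codes collide are still treated as duplicates.
        int hx = x == null ? 0 : x.GetHashCode();
        int hy = y == null ? 0 : y.GetHashCode();
        return hx.CompareTo(hy);
    }
}

public static class TieBreakDemo
{
    // Compares ints by their last digit only, so 1 and 11 tie.
    private static int ByLastDigit(int a, int b)
    {
        return (a % 10).CompareTo(b % 10);
    }

    public static int CountWithPlainComparer()
    {
        var set = new SortedSet<int>(Comparer<int>.Create(ByLastDigit));
        set.Add(1);
        set.Add(11);        // Add returns false: the comparer reports 0
        return set.Count;   // 1
    }

    public static int CountWithTieBreak()
    {
        var set = new SortedSet<int>(new HashTieBreakComparer<int>(ByLastDigit));
        set.Add(1);
        set.Add(11);        // kept: the int hash codes 1 and 11 differ
        return set.Count;   // 2
    }
}
```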
 
not_a_commie

Registered User said:
The real issue is identical items cannot be sorted by comparison; any
ordering of identical items is purely arbitrary.

I don't really care what order the items are in if they have different
hash codes but the comparison shows they're equal. That's standard
behavior when sorting.
 
Gene Wirchenko

Attribution lost:

Really? You're going to limit the definition of set to say that it
can't have any duplicates? (And I suppose I'd be okay with that
definition in this situation if SortedSet<T> actually used the hash
code to make that determination.)

I hope so. That is what a set is. (See your local, friendly
mathematician for more details.) If you require duplicates, then do
not use a set.

[snip]

Sincerely,

Gene Wirchenko
 
Registered User

not_a_commie said:
I don't really care what order the items are in if they have different
hash codes but the comparison shows they're equal. That's standard
behavior when sorting.

It's a bit of a conundrum that you both do and don't care how the
individual items are sorted. There is no standard behavior which
resolves such a conundrum.

Rethink what you're trying to accomplish. The sequence
A,B,B,C,E,E,F
can be created without comparing the individual letters to determine
the desired order. That sequence can be created by grouping identical
letters and then ordering the groups as desired.
{A},{B,B},{C},{E,E},{F}

Implementing such behavior should not be difficult.
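
A minimal sketch of that grouping approach, assuming a
SortedDictionary whose values are lists holding the items that share a
key. SortedGroups is a hypothetical name chosen for this example.

```csharp
using System.Collections.Generic;

// Hypothetical "sorted bag": items with equal keys share a group, groups stay ordered by key.
public sealed class SortedGroups<TKey, TItem>
{
    private readonly SortedDictionary<TKey, List<TItem>> _groups =
        new SortedDictionary<TKey, List<TItem>>();

    // O(log n) to find or create the group, then an append to its list.
    public void Add(TKey key, TItem item)
    {
        List<TItem> group;
        if (!_groups.TryGetValue(key, out group))
        {
            group = new List<TItem>();
            _groups.Add(key, group);
        }
        group.Add(item);
    }

    // Walks the groups in key order; items inside a group come out in insertion order.
    public IEnumerable<TItem> InOrder()
    {
        foreach (KeyValuePair<TKey, List<TItem>> pair in _groups)
        {
            foreach (TItem item in pair.Value)
            {
                yield return item;
            }
        }
    }
}
```

Adding the letters F, B, E, A, B, C, E in that order with each letter as
its own key and then enumerating InOrder() yields A, B, B, C, E, E, F.
No item is ever compared against another item of the same group.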

regards
A.G.
 
