Sorting Arraylist Of Objects

G

Guest

I would like to sort an Arraylist of objects on multiple properties. For
instance, I have a Sort Index property and an ID property (both integers).
So, the results of my sort would look like this:

sort index, id
1000,1
1000,2
1000,3
1001,1
1001,2
....

I know I can implement IComparable.CompareTo to sort on one property but I
am not completely clear on how to sort on two. Thanks for any help.
 
T

Tom Dacon

A good way to do this is as follows:

In the Compare method, for each object create a string that has the
concatenated string representations of the properties you want to sort on,
and then compare the strings.

For instance, suppose the first property in your example has a maximum
length, as a string, of five characters, and the second one has a maximum
length, as a string, of four characters. So the first property which you
give in your example could have values from zero to 99999, and the second
one could have values from zero to 9999. Then you'd do something like this:

dim str1 as string = firstObj.Property1.ToString().Padleft(5, "0"c) +
firstObj.Property2.ToString().PadLeft(4, "0"c)
dim str2 as string = secondObj.Property1.ToString().Padleft(5, "0"c) +
secondObj.Property2.ToString().PadLeft(4, "0"c)

return String.Compare(str1, str2)

You'd be comparing strings like "010000001" and "010010002".

To sort descending on both properties, you'd:

return -(String.Compare(str1, str2)

And to sort descending on just one property, subtract that property value
from the highest value it can hold and do the ToString and Padleft on that
value. So your first example line below, to sort ascending on property one
and descending on property two, would look like: "010009998".

HTH,
Tom Dacon
Dacon Software Consulting
 
G

Guest

In your compare routine, you should be able to compare both properties to see
which class instance should be first.
 
C

Chris Dunaway

Tom said:
A good way to do this is as follows:

In the Compare method, for each object create a string that has the
concatenated string representations of the properties you want to sort on,
and then compare the strings.

For instance, suppose the first property in your example has a maximum
length, as a string, of five characters, and the second one has a maximum
length, as a string, of four characters. So the first property which you
give in your example could have values from zero to 99999, and the second
one could have values from zero to 9999. Then you'd do something like this:

dim str1 as string = firstObj.Property1.ToString().Padleft(5, "0"c) +
firstObj.Property2.ToString().PadLeft(4, "0"c)
dim str2 as string = secondObj.Property1.ToString().Padleft(5, "0"c) +
secondObj.Property2.ToString().PadLeft(4, "0"c)

return String.Compare(str1, str2)

You'd be comparing strings like "010000001" and "010010002".

To sort descending on both properties, you'd:

return -(String.Compare(str1, str2)

And to sort descending on just one property, subtract that property value
from the highest value it can hold and do the ToString and Padleft on that
value. So your first example line below, to sort ascending on property one
and descending on property two, would look like: "010009998".

HTH,
Tom Dacon
Dacon Software Consulting

Why create strings when it is not necessary? All he needs to do is in
the CompareTo method to first check the value of the index. If they
are not equal then do the compare on the index. If the indexes are
equal, then do the compare on the ID:

If the indexes are not equal, then the value of the Id won't affect the
sort anyway.

If obj1.Index <> obj2.Index Then
return obj1.Index.CompareTo(Obj2.Index)
Else
return obj1.Id.CompareTo(obj2.Id)
End If
 
T

Tom Dacon

Why create strings when it is not necessary? All he needs to do is in
the CompareTo method to first check the value of the index. If they
are not equal then do the compare on the index. If the indexes are
equal, then do the compare on the ID:

If the indexes are not equal, then the value of the Id won't affect the
sort anyway.

If obj1.Index <> obj2.Index Then
return obj1.Index.CompareTo(Obj2.Index)
Else
return obj1.Id.CompareTo(obj2.Id)
End If

Just think for a moment what the code would look like if the requirement
were to sort on three, or four, or more property values. What I'm showing is
a generalized solution to the specific problem that the OP posed, that can
be used not only for this simple requirement but also for more complex
sorting requirements. Once you understand how to structure a sorting
solution this way, for the simple case given, you can easily apply it to
other and perhaps more complex problems. This is a good technique to add to
your toolkit - it's is simple and easy to remember, and scales easily as you
sort on more and more properties.

Tom Dacon
Dacon Software Consulting
 
C

Chris Dunaway

Tom said:
Just think for a moment what the code would look like if the requirement
were to sort on three, or four, or more property values. What I'm showing is
a generalized solution to the specific problem that the OP posed, that can
be used not only for this simple requirement but also for more complex
sorting requirements. Once you understand how to structure a sorting
solution this way, for the simple case given, you can easily apply it to
other and perhaps more complex problems. This is a good technique to add to
your toolkit - it's is simple and easy to remember, and scales easily as you
sort on more and more properties.

I agree that it can be a bit unwieldy for more than a few components,
but your method is creating at least 6 strings for each comparison: 4
PadLeft and 2 concatenations. And for more components, that will
increase. Sorting an arraylist with a large number of items may incur
a performace hit.

I'm just trying to say that I would think twice before creating a bunch
of strings that might incur a significant performance hit.

It would be interesting to see what the differences are and if they are
a cause for concern.
 
C

Chris Dunaway

Tom said:
a generalized solution to the specific problem that the OP posed, that can
be used not only for this simple requirement but also for more complex
sorting requirements. Once you understand how to structure a sorting
solution this way, for the simple case given, you can easily apply it to
other and perhaps more complex problems. This is a good technique to add to
your toolkit - it's is simple and easy to remember, and scales easily as you
sort on more and more properties.

And I would also say that having a nested If is not that bad. It
assumes that the properties themselves are comparable, however, so that
may be one instance where converting to a string would be applicable.
But then you only need do it when necessary. Also in your method, if
the first properties differ, why convert other properties when their
values will not affect the outcome?

Dim iResult as Integer

If obj1.prop1 = obj2.prop1 Then
If obj1.prop2 = obj2.prop2 Then
If obj1.prop3 = obj2.prop3 Then
If obj1.prop4 = obj2.prop4 Then
If obj1.prop5 = obj2.prop5 Then
iResult = obj1.prop6.CompareTo(obj2.prop6)
Else
iResult = obj1.prop5.CompareTo(obj2.prop5)
End If
Else
iResult = obj1.prop4.CompareTo(obj2.prop4)
End If
Else
iResult = obj1.prop3.CompareTo(obj2.prop3)
End If
Else
iResult = obj1.prop2.CompareTo(obj2.prop2)
End If
Else
iResult = obj1.prop1.CompareTo(obj2.prop1)
End If

Return iResult
 
T

Tom Dacon

I agree that it can be a bit unwieldy for more than a few components,
but your method is creating at least 6 strings for each comparison: 4
PadLeft and 2 concatenations. And for more components, that will
increase. Sorting an arraylist with a large number of items may incur
a performace hit.

I'm just trying to say that I would think twice before creating a bunch
of strings that might incur a significant performance hit.

It would be interesting to see what the differences are and if they are
a cause for concern.

It'll depend more on the number of elements in the array than anything else.
For small lists, you probably wouldn't be able to even measure the impact.
For lists of reasonable size there's not likely to be a noticeable impact
(such as for sorting lists of a size that you would present in a user
interface). For extremely large lists, you'd be looking at using something
besides an arraylist and a comparer anyway.

Tom
 
T

Tom Dacon

Well, suit yourself. To me this code looks really ugly, difficult to
maintain or extend, and error-prone to code (and I mean that, of course, in
the nicest possible way). And I might point out the sheer number of
individual comparisions in this algorithm, compared to the single comparison
in the algorithm under discussion.

But everyone has his own way of doing things. My intention was to offer to
the community a useful solution to the problem of sorting on multiple
properties in a list comparer, that might not occur to a lot of people.
Whether or not one uses it depends on other aspects of the problem at hand,
as has already been noted.

That's all for me on this topic,

Tom
 
C

Chris Dunaway

Tom said:
Just think for a moment what the code would look like if the requirement
were to sort on three, or four, or more property values. What I'm showing is
a generalized solution to the specific problem that the OP posed, that can
be used not only for this simple requirement but also for more complex
sorting requirements. Once you understand how to structure a sorting
solution this way, for the simple case given, you can easily apply it to
other and perhaps more complex problems. This is a good technique to add to
your toolkit - it's is simple and easy to remember, and scales easily as you
sort on more and more properties.

I created the following code to test this. The simple class that is
being sorted consists of 3 integers and 3 strings, randomly generated.
I did not implement IComparable on the class itself, but rather created
2 IComparer<SimpleClass> classes so I could keep them separate. I also
used a generic List since it was faster to create. For the comparer
that implements your method I padded all integers on the left with 0
for a total length of 5 and for the strings, I padded on the left with
the letter A to 9 characters. The strings were randomly generated with
a length from 4 to 9 characters.

For 100000 object, my method took .428 seconds. The method using
strings took 15.353 seconds. I don't know if this is conclusive, there
may be a problem with my testing methodolgy, but I would appreciate
some input. Here is the code: (I also used C# since I am faster at
developing with it)

Here are the comparer classes:

class Comparer1 : IComparer<SimpleClass>
{
public int Compare(SimpleClass x, SimpleClass y)
{
int result;

if (x.Prop1 == y.Prop1)
if (x.Prop2 == y.Prop2)
if (x.Prop3 == y.Prop3)
if (x.Prop4 == y.Prop4)
if (x.Prop5 == y.Prop5)
result = x.Prop6.CompareTo(y.Prop6);
else
result = x.Prop5.CompareTo(y.Prop5);
else
result = x.Prop4.CompareTo(y.Prop4);
else
result = x.Prop3.CompareTo(y.Prop3);
else
result = x.Prop2.CompareTo(y.Prop2);
else
result = x.Prop1.CompareTo(y.Prop1);

return result;
}
}

class Comparer2 : IComparer<SimpleClass>
{
public int Compare(SimpleClass x, SimpleClass y)
{
string str1 = x.Prop1.ToString().PadLeft(5, '0') +
x.Prop2.PadLeft(9, 'A') +
x.Prop3.ToString().PadLeft(5, '0') +
x.Prop4.ToString().PadLeft(5, '0') +
x.Prop5.PadLeft(9, 'A') +
x.Prop6.PadLeft(9, 'A');

string str2 = y.Prop1.ToString().PadLeft(5, '0') +
y.Prop2.PadLeft(9, 'A') +
y.Prop3.ToString().PadLeft(5, '0') +
y.Prop4.ToString().PadLeft(5, '0') +
y.Prop5.PadLeft(9, 'A') +
y.Prop6.PadLeft(9, 'A');

return str1.CompareTo(str2);
}
}


Here is the code to generate the objects, and sort them:

Random rnd = new Random(Environment.TickCount);
private const int instanceCount = 100000;

private void button3_Click(object sender, EventArgs e)
{
List<SimpleClass> items1 = createListOfItems();
List<SimpleClass> items2 = createListOfItems();

Stopwatch watch = new Stopwatch();

//Sort by Comparer1
watch.Reset();
watch.Start();
items1.Sort(new Comparer1());
watch.Stop();
MessageBox.Show(watch.Elapsed.ToString());

//Sort by Comparer2
watch.Reset();
watch.Start();
items2.Sort(new Comparer2());
watch.Stop();
MessageBox.Show(watch.Elapsed.ToString());
}

private List<SimpleClass> createListOfItems()
{
List<SimpleClass> list = new List<SimpleClass>();

for (int i = 0; i < instanceCount; i++)
{
SimpleClass s = new SimpleClass();
s.Prop1 = rnd.Next(1, 10001);
s.Prop2 = createRandomString(rnd.Next(4, 9));
s.Prop3 = rnd.Next(1, 10001);
s.Prop4 = rnd.Next(1, 10001);
s.Prop5 = createRandomString(rnd.Next(4, 9));
s.Prop6 = createRandomString(rnd.Next(4, 9));

list.Add(s);
}

return list;
}

private string createRandomString(int p)
{
string letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

StringBuilder sb = new StringBuilder();

for (int i = 0; i < p; i++)
{
sb.Append(letters[rnd.Next(0, letters.Length)]);
}
return sb.ToString();
}
 
C

Chris Dunaway

Chris Dunaway wrote:

Oops, I forgot to post the SimpleClass:

class SimpleClass
{
private int _prop1;
public int Prop1
{
get { return _prop1; }
set { _prop1 = value; }
}

private string _prop2;
public string Prop2
{
get { return _prop2; }
set { _prop2 = value; }
}

private int _prop3;
public int Prop3
{
get { return _prop3; }
set { _prop3 = value; }
}

private int _prop4;
public int Prop4
{
get { return _prop4; }
set { _prop4 = value; }
}

private string _prop5;
public string Prop5
{
get { return _prop5; }
set { _prop5 = value; }
}

private string _prop6;
public string Prop6
{
get { return _prop6; }
set { _prop6 = value; }
}
}
 
C

Chris Dunaway

Tom said:
Well, suit yourself. To me this code looks really ugly, difficult to
maintain or extend, and error-prone to code (and I mean that, of course, in
the nicest possible way).

I don't see how it's any more difficult to maintain and extend than the
string version. Both will require changing the code and a recompile.
As for ugliness, I agree it's not the prettiest code.
And I might point out the sheer number of
individual comparisions in this algorithm, compared to the single comparison
in the algorithm under discussion.

But I believe the individual comparisons are less than the cost of
creating a new string. And in the version I posted there are at most 6
comparisons per call to the CompareTo method and as few as 2 -- to sort
on 6 fields. Whereas the string version will ALWAYS create 14 strings,
6 PadLeft and one concatenation for each object compared.
But everyone has his own way of doing things. My intention was to offer to
the community a useful solution to the problem of sorting on multiple
properties in a list comparer, that might not occur to a lot of people.

And I agree that it is clever and useful, but anytime I see alot of
strings and concatenation, it raises a red flag in my mind. Especially
since strings in .Net are immutable.
Whether or not one uses it depends on other aspects of the problem at hand,
as has already been noted.

I agree (as you pointed out elsewhere) that the number of individual
fields to be sorted on as well as the number of records to sort will
affect the relative time needed to perform the sort

Cheers!

Chris
 
T

Tom Dacon

Interesting result. My use of this technique has been with much smaller
lists, on the order of a few hundred at most. Perhaps a StringBuilder or two
might be worth inserting into the algorithm, or perhaps the conclusion to be
reached is that the technique just isn't appropriate for lists on that
scale.

Thanks for taking the time to test it.

Tom
 
C

Chris Dunaway

Tom said:
Interesting result. My use of this technique has been with much smaller
lists, on the order of a few hundred at most. Perhaps a StringBuilder or two
might be worth inserting into the algorithm, or perhaps the conclusion to be
reached is that the technique just isn't appropriate for lists on that
scale.

Thanks for taking the time to test it.

I also realized that padding on the left with the letter A for the
string properties was wrong. It think it changed the sort order so I
removed that and it took only 12 seconds.

I think I'll try the string builder and see what happens.
 
C

Chris Dunaway

Chris said:
I also realized that padding on the left with the letter A for the
string properties was wrong. It think it changed the sort order so I
removed that and it took only 12 seconds.

I think I'll try the string builder and see what happens.

Here is the string builder version but it didn't seem to save much
time:

class Comparer2 : IComparer<SimpleClass>
{
public int Compare(SimpleClass x, SimpleClass y)
{
StringBuilder sb1 = new StringBuilder(50);
StringBuilder sb2 = new StringBuilder(50);

sb1.Append(x.Prop1.ToString().PadLeft(5, '0'));
sb1.Append(x.Prop2.PadRight(9,' '));
sb1.Append(x.Prop3.ToString().PadLeft(5, '0'));
sb1.Append(x.Prop4.ToString().PadLeft(5, '0'));
sb1.Append(x.Prop5.PadRight(9, ' '));
sb1.Append(x.Prop6.PadRight(9, ' '));

sb2.Append(y.Prop1.ToString().PadLeft(5, '0'));
sb2.Append(y.Prop2.PadRight(9, ' '));
sb2.Append(y.Prop3.ToString().PadLeft(5, '0'));
sb2.Append(y.Prop4.ToString().PadLeft(5, '0'));
sb2.Append(y.Prop5.PadRight(9, ' '));
sb2.Append(y.Prop6.PadRight(9, ' '));

return sb1.ToString().CompareTo(sb2.ToString());
}
}
 
C

Chris Dunaway

Tom said:
Interesting result. My use of this technique has been with much smaller
lists, on the order of a few hundred at most. Perhaps a StringBuilder or two

You're right, for only a few hundred to a few thousand object, the
difference was only about 1 second.

While the method I proposed might be faster for very large lists, it
does sacrifice a little readablity. For smaller lists, sacrificing an
almost negligible amount of time for readability may be the way to go.

Anyway, I think we have beaten this horse into the ground!

Chris
 
T

Tom Dacon

indeed

Chris Dunaway said:
You're right, for only a few hundred to a few thousand object, the
difference was only about 1 second.

While the method I proposed might be faster for very large lists, it
does sacrifice a little readablity. For smaller lists, sacrificing an
almost negligible amount of time for readability may be the way to go.

Anyway, I think we have beaten this horse into the ground!

Chris
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top