S
sprash
I am writing a program to get the mode of a given set of numbers.
As a refresher, mode is the number occuring most often in a list and
1. if 'n' numbers have the same maximum frequency in the list, all
those 'n' numbers are modes
2. If the maximum frequency is 1, the list does not have a mode.
Understandably, this list can be potentially huge and so the list of
modes can be huge. However, the number of modes in a list that contains
'n' elements cannot be more than 'n/2' due to the above rules.
My algorithm simply sorts the array and does a linear scan to measure
the length of all adjacent runs.
The question is, what data structure should I use to hold the mode list
keeping in mind that my list can grow as much as n/2 and can also be as
short as 1. Also note that my mode list can suddenly shrink from a
large number of elements to as small as 1 if it discovers a number
which occurs more than those contained in a list.
I am considering an ArrayList with initial capacity n/2 to hold the
mode, but I do realize that this is inefficient because:
1. The data it will hold will be a value type, hence the performance
penalty in boxing my Int32s
2. I might not even use n/2 values. But then how do I decide when to
use TrimToSize?
Any suggestions?
As a refresher, mode is the number occuring most often in a list and
1. if 'n' numbers have the same maximum frequency in the list, all
those 'n' numbers are modes
2. If the maximum frequency is 1, the list does not have a mode.
Understandably, this list can be potentially huge and so the list of
modes can be huge. However, the number of modes in a list that contains
'n' elements cannot be more than 'n/2' due to the above rules.
My algorithm simply sorts the array and does a linear scan to measure
the length of all adjacent runs.
The question is, what data structure should I use to hold the mode list
keeping in mind that my list can grow as much as n/2 and can also be as
short as 1. Also note that my mode list can suddenly shrink from a
large number of elements to as small as 1 if it discovers a number
which occurs more than those contained in a list.
I am considering an ArrayList with initial capacity n/2 to hold the
mode, but I do realize that this is inefficient because:
1. The data it will hold will be a value type, hence the performance
penalty in boxing my Int32s
2. I might not even use n/2 values. But then how do I decide when to
use TrimToSize?
Any suggestions?