On Thu, 5 Nov 2009 03:57:32 -0800 (PST), Ray <(E-Mail Removed)> wrote:
>Hi -
>
>I'm performing a shopping cart analysis (if customers purchase product
>A what else do they also purchase?) and have run into a bit of a
>snag ...
>
>I need a procedure that will return the most frequently occurring text
>string found in a contiguous block of 4784 cells (A1:AT104). There
>are over 10,000 possible values that could be in these cells, so
>counting the occurrence of ALL of them wouldn't be that efficient ...
>
>Additionally, a significant number of the cells in the target block
>could contain the value 'end' or '#N/A' -- these values need to be
>ignored.
>
>Ideally, the solution would be scalable (allow users to select larger
>or smaller range) and also allow for choosing the Most frequent, 2nd
>most frequent, 3rd most frequent, etc values. So, something like:
>=UDFname(range,3) would return the 3rd most frequently occurring text
>string in the specified range.
>
>Any ideas of how to do such a thing? The only solutions I could find
>required either knowing what string value to look for or returning a
>list of values (sorted one way or another).
>
>Thanks for looking!
>
>//ray
Here's one suggestion.
The routine creates a list of unique entries, and the frequency of each entry.
It then computes the nth most frequent entry and returns all of the strings
that match. (So if different strings have the same frequency, it will return
all of them)
If there are multiple matches, it returns the results in an array and you can
get at them using an Index function.
This works similar to the LARGE function in that if the frequency of two items
is the same, they will take up "two slots". In other words, if the avocados
and peaches both are the most frequent, with 10 occurrences, the will both be
returned in the event Index = 1 or Index = 2.
If this is not what you want, we could add some sorting to the algorithm and
make it return unique values.
==============================================================
Function LargeStrings(rg As Range, Optional Index As Long = 1) As Variant
Dim cStrings As Collection
Dim c As Range
Dim temp(), tempResults()
Dim i As Long, j As Long
'get list of unique strings
Set cStrings = New Collection
On Error Resume Next
For Each c In rg
If Not (c.Text = "end" Or c.Text = "#NA") Then
cStrings.Add Item:=c.Value, Key:=c.Value
End If
Next c
On Error GoTo 0
If Index > cStrings.Count Then
LargeStrings = CVErr(xlErrNum)
Exit Function
End If
'put into array with count
ReDim temp(0 To 1, 0 To cStrings.Count - 1)
ReDim tempCount(0 To cStrings.Count - 1)
ReDim tempItems(0 To cStrings.Count - 1)
For i = 1 To cStrings.Count
temp(0, i - 1) = cStrings(i)
temp(1, i - 1) = WorksheetFunction.CountIf(rg, cStrings(i))
Next i
'get desired index
j = WorksheetFunction.Large(temp, Index)
'put into array
ReDim tempResults(0)
For i = 0 To UBound(temp, 2)
If temp(1, i) = j Then
tempResults(UBound(tempResults)) = temp(0, i)
ReDim Preserve tempResults(UBound(tempResults) + 1)
End If
Next i
ReDim Preserve tempResults(UBound(tempResults) - 1)
If UBound(tempResults) = 0 Then
LargeStrings = tempResults(0)
Else
LargeStrings = tempResults
End If
End Function
===========================================
--ron
|