zaisaki said:
I have a data set with variables id, a, b, c, d and looks like:
1 2 3 5 1
1 5 6 5 3
1 2 4 2 4
2 1 3 4 5
2 5 6 2 3
2 4 5 2 1
I would like to summarize data for each id and create a new variable
representing each code occuring in variables a, b and c.
Essentially the new data set should have variables id, v1 - v6.
v1 represents the number of times code 1 occurs in variables a, b and c
in the original data set, and so on for v2 - v6.
The data set would like;
1 1 3 2 2 3 1
2 2 2 2 2 1 1
....
Given your source data, neither your 1st or 2nd result rows are consistent
with counting only fields a-c. The 1st row is consistent with counting
fields a-d, the 2nd isn't consistent with any set of fields since the 2nd
through 7th columns total 10, which isn't a multiple of 3 (the number of
rows in your source data with ID=2). If you meant to count fields a-d, your
second row should be
2 2 2 2 2 3 1
You could do this entirely with formulas, or a pivot table if you add field
names. As for formulas, if your source data was in a range named TBL, and
the top-left result cell were G5, try
G5:
=INDEX(TBL,1,1)
G6 [array formula]:
=INDEX(TBL,MATCH(0,COUNTIF(G$5:G5,INDEX(TBL,0,1)),0),1)
H5:M5 [array formula]:
=MMULT(COLUMN(INDIRECT("RC1:RC"&COUNTIF(INDEX(TBL,0,1),$G5),0))^0,
COUNTIF(OFFSET(TBL,SMALL(IF(INDEX(TBL,0,1)=$G5,ROW(TBL)),
ROW(INDIRECT("1:"&COUNTIF(INDEX(TBL,0,1),$G5))))-ROW(INDEX(TBL,1,1)),1,1,
COLUMNS(TBL)-1),{1,2,3,4,5,6}))
Select H5:M5 and fill down into H6:M6. Then select G6:M6 and fill down as
needed.
An alternative VBA approach that ignores values in fields a-d that aren't
integers between 1 and 6 would be
Sub foo()
'requires reference to Microsoft Scripting Runtime
Dim dk As New Dictionary, dv As New Dictionary
Dim sv As Variant, rv As Variant
Dim i As Long, j As Long, k As Long, x As Variant
k = 0
'modify next line as needed
For Each x In Array(1, 2, 3, 4, 5, 6)
k = k + 1
dv.Add Key:=x, Item:=k
Next x
k = 16
ReDim rv(0 To dv.Count, 1 To k)
sv = Range("A1").CurrentRegion.Value
For i = 1 To UBound(sv, 1)
If Not dk.Exists(sv(i, 1)) Then
dk.Add Key:=sv(i, 1), Item:=dk.Count + 1
rv(0, dk.Count) = sv(i, 1)
End If
If dk.Count >= k Then
k = 2 * k
ReDim Preserve rv(0 To dv.Count, 1 To k)
End If
For j = 2 To UBound(sv, 2)
If dv.Exists(sv(i, j)) Then _
rv(dv(sv(i, j)), dk(sv(i, 1))) = rv(dv(sv(i, j)), dk(sv(i, 1))) + 1
Next j
Next i
ReDim Preserve rv(0 To dv.Count, 1 To dk.Count)
ActiveCell.Resize(dk.Count, dv.Count + 1).Value = _
Application.WorksheetFunction.Transpose(rv)
End Sub