unique values in an array

A

Alan Beban

Harlan Grove wrote:. . > . . .but note that your parameter to omit
blanks only eliminates
zero-length strings. If the input array were a range, and that range
contained blank cells (in the ISBLANK sense), those cells would be
recorded as Empty rather than "", so using your optional parameter
wouldn't eliminate them. My point is that IF a user would want to
eliminate both "" and Empty, it's more efficient to reduce the array
to the distinct values POSSIBLY INCLUDING "" and Empty, then eliminate
the at most single remaining "" and Empty items.

No. Despite the fact that empty cells are recorded as Empty, the
Dictionary Add Method treats zero-length strings and empty cells both
the same; it loads "" into the dictionary, and once it's loaded either,
it won't load the other. So the optional ArrayUniques parameter does
indeed eliminate both "" and Empty.

Alan Beban
 
H

Harlan Grove

Alan Beban said:
No. Despite the fact that empty cells are recorded as Empty, the
Dictionary Add Method treats zero-length strings and empty cells
both the same; it loads "" into the dictionary, and once it's loaded
either, it won't load the other. So the optional ArrayUniques
parameter does indeed eliminate both "" and Empty.

My diagnosis was wrong. ArrayUniques does something different wrong.

You're assigning CStr(Elem) to Dictionary keys, and that does convert
blank cells to "", but it also eliminates the distinction between,
say, 0 the number and "0" the text string, which means your
ArrayUniques may be eliminating too many items from its passed array.
If you assign Elem itself, whatever type it may happen to be, to
Dictionary keys, "" and Empty will be treated as distinct values.
 
A

Alan Beban

Harlan said:
. . .
Now that you mention it, this lies in an inefficient block of code.

On Error Resume Next
For Each Elem In arr
x.Add Item:=Elem, Key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ("")
On Error GoTo 0

There's no need for the enclosing error trapping, which slows things
down. Just test whether each new array value is already loaded into
the dictionary, and only add those that aren't. Similarly, only remove
"" if it exists.

For Each Elem In arr
If not x.Exists(Elem) Then x.Add Item:=0, Key:=Elem
Next
If OmitBlanks And x.Exists("") Then x.Remove ""

You're correct. A quick and dirty test suggests that it's on the order
of a tenth of a second faster for an array of 10,000 elements.

Alan Beban
 
A

Alan Beban

Harlan said:
. . .
Now that you mention it, this lies in an inefficient block of code.

On Error Resume Next
For Each Elem In arr
x.Add Item:=Elem, Key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ("")
On Error GoTo 0

There's no need for the enclosing error trapping, which slows things
down. Just test whether each new array value is already loaded into
the dictionary, and only add those that aren't. Similarly, only remove
"" if it exists.

For Each Elem In arr
If not x.Exists(Elem) Then x.Add Item:=0, Key:=Elem
Next
If OmitBlanks And x.Exists("") Then x.Remove ""
Item:=0 needs to be Item:=Elem (or the later y = x.Items needs to be
changed to y = x.Keys)

Alan Beban
 
H

Harlan Grove

Alan Beban said:
Harlan Grove wrote: ....

Item:=0 needs to be Item:=Elem (or the later y = x.Items needs to be
changed to y = x.Keys)

Yup, make the latter assignment y = x.Keys.
 
A

Alan Beban

Harlan said:
My diagnosis was wrong. ArrayUniques does something different wrong.

You're assigning CStr(Elem) to Dictionary keys, and that does convert
blank cells to "" . . .
If you assign Elem itself, whatever type it may happen to be, to
Dictionary keys, "" and Empty will be treated as distinct values.

Nope (you should really test this stuff before you state conclusions).
After your comment about error trapping being inefficient, I substituted
your suggested snippet, except that because I later had y = x.Items to
load the Dictionary items into an array, I changed Item:=0 in your
snippet to Item:=Elem. As you suggest, "" and Empty were treated as
distinct values to the extent that if a blank cell precedes the first
"", then the Add Method appears to load Empty rather than "", but it
still fails (appropriately) to load the subsequent zero-length string,
and the loaded Empty is treated as "" in the later If x.Exists("")
statement; i.e., again assuming an empty cell precedes the zero-length
string in the range/array being tested for duplicates, Empty is loaded
rather than "" and the zero-length string is treated as a duplicate, but
If x.Exists("") nevertheless returns True and the Empty is removed with
x.Remove ""
You're assigning CStr(Elem) to Dictionary keys, and that . . .
eliminates the distinction between,
say, 0 the number and "0" the text string, which means your
ArrayUniques may be eliminating too many items from its passed array.

Nope; again an erroneous apparently untested conclusion. Assigning Elem
itself to the Dictionary keys eliminates the distinction between 0 and
"0"; assigning CStr(Elem) preserves it.

These results may not be the way you think VBA works, or would like or
expect it to work, but them's the facts.

Alan Beban
 
A

Alan Beban

Harlan said:
. . .
Now that you mention it, this lies in an inefficient block of code.

On Error Resume Next
For Each Elem In arr
x.Add Item:=Elem, Key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ("")
On Error GoTo 0

There's no need for the enclosing error trapping, which slows things
down. Just test whether each new array value is already loaded into
the dictionary, and only add those that aren't. Similarly, only remove
"" if it exists.

For Each Elem In arr
If not x.Exists(Elem) Then x.Add Item:=0, Key:=Elem
Next
If OmitBlanks And x.Exists("") Then x.Remove ""

The snippet I'm using is

For Each Elem In arr
If not x.Exists(Elem) Then x.Add Item:=Elem, Key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ""

Item:=Elem because the later code uses y = x.Items to load the array

Key:=Cstr(Elem) to preserve the distinction between, e.g., 0 and "0"
(and it continues to allow blanks and zero-length strings to be treated
as duplicates)

Delete And x.Exists("") as superfluous

Thanks for the constructive treatment; even the erroneous conclusions
were instructive.

Alan Beban
 
H

Harlan Grove

Alan Beban said:
Harlan Grove wrote... ....
Nope (you should really test this stuff before you state
conclusions). . . . the Add Method appears to load Empty rather than
"", but it still fails (appropriately) to load the subsequent
zero-length string, and the loaded Empty is treated as "" in the
later If x.Exists("") statement; . . .
....

You're right, but I consider this a bug in the Dictionary object to be
worked around, requiring more code when filling the Dictionary.

For Each x In a
If Not d.Exists(x) Then d.Add Key:=x, Item:=0

'workaround for "" and Empty keys treated as the same
If IsEmpty(x) Then
d.Item("") = d.Item("") Or 2 'set 2nd bit
If i < 0 Then i = 2 'Empty appears first
ElseIf x = "" Then
d.Item("") = d.Item("") Or 1 'set 1st bit
If i < 0 Then i = 1 '"" appears first
End If
Next x

x = IIf(d.Exists(""), d.Item(""), 0)

More code when calculating the result array's 1st dimension bounds.

k = LBound(a, 1)
'workaround for "" and Empty keys treated as the same
If x < 3 Then 'either Empty or "" only appears or neither did
n = d.Count + k - 1
x = Null
Else 'both Empty and "" appeared
n = d.Count + k
x = IIf(i = 2, "", Empty) 'the one that appeared LAST
End If

And different code to fill the result array.

'workaround for "" and Empty keys treated as the same
If Not IsNull(x) Then a(n) = x

For j = 0 To d.Count - 1
a(j + k) = d.Keys(j)
Next j

or

'workaround for "" and Empty keys treated as the same
If Not IsNull(x) Then a(n, i) = x

For j = 0 To d.Count - 1
a(j + k, i) = d.Keys(j)
Next j
Nope; again an erroneous apparently untested conclusion. Assigning
Elem itself to the Dictionary keys eliminates the distinction
between 0 and "0"; assigning CStr(Elem) preserves it.

These results may not be the way you think VBA works, or would like
or expect it to work, but them's the facts.

This one I did test. Guess you either didn't or botched it.

Let's test them facts, shall we?

In the module where ArrayUniques resides, run this macro

Sub hgtest()
Dim v As Variant, x As Variant
v = ArrayUniques(Array(1, "1", "foo"))
Debug.Print LBound(v, 1); UBound(v, 1)
For Each x In v
Debug.Print TypeName(x), x
Next x
End Sub

The resulting Debug output is

1 2
Double 1
String foo

which shows ArrayUniques eliminates "1". Swap the order of 1 and "1",
and the Debug output becomes

1 2
String 1
String foo

this time eliminating 1.

Now run similar tests using my revised adistinct function.

Sub hgtest()
Dim v As Variant, x As Variant
v = adistinct(Array(1, "1", "foo", Empty, ""))
Debug.Print LBound(v, 1); UBound(v, 1)
For Each x In v
Debug.Print TypeName(x), x
Next x
End Sub

which produces the following Debug output

0 4
Integer 1
String 1
String foo
Empty
String

BTW, note that ArrayUniques stores numeric 1 as a Double while
adistinct stores it as an Integer. FTHOI, reverse the order of 1 and
"1" in the latter test procedure, and the Debug output becomes

0 4
String 1
Integer 1
String foo
Empty
String

So what exactly was the point you were trying to make?
 
A

Alan Beban

Harlan said:
. . .


This one I did test. Guess you either didn't or botched it.

Botched it. I was running the equivalent of

Set rng = Range("b1:b3")
v = ArrayUniques(rng)

where b1:b3 contained

1
"1"
foo

which produces the output

1 3
Double 1
String "1"
String foo

Back to the drawing boards.

Alan Beban
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top