Unexplainable Worksheet Function Calculation Process

N

Nicholas Dreyer

I have gotten these symptoms to show up in both Excel 2003 and Excel
2000.

The following downloadable sheet, has just the one user-defined
function (in cell selected when opened). To make testing manageable,
calculation is purposely set to "manual" and "not on save". After
opening, pressing F9 (calculate key) takes you through 1981 passes of
the function until the sheet thinks it is calculated. The function
has a 2001-cell range argument. Calculation can be forced by
editing/changing any single cell in the range D18 through AA18

As usual, during calculation you can interrupt the process by
selecting a cell on the sheet before it is done. Checking the seconds
per function pass dumped to the Immediate Window, you will be able to
estimate roughly how long it will take to finish after 1981 passes (5
minutes on one of my machines, 10 on another).

http://www.oz.net/~gurfler/Download/TestBugN.xls

Once you have let the calculation complete, it never takes that long
again. The following sheet is in that state, but is in no other
respect different:

http://www.oz.net/~gurfler/Download/TestBugOK.xls

Diagnostics illustrating very strange behavior during each pass of the
function are dumped to the Immediate Window in the following (just hit
F9 as soon as it is loaded):

http://www.oz.net/~gurfler/Download/TestBugDebug.xls

These diagnostics indicate that on first pass the function only sees
one cell in its range-argument as non-empty, next pass two cells,
etc., more or less, until everything clears up after 1981 passes (why
20 short of 2001 ???).

Making the range argument of the function into a worksheet array is a
great way to prevent this horrible stall in calculation, though the
diagnostic here does show that even then it still takes two passes to
complete, first pass apparently all cells come through empty.

http://www.oz.net/~gurfler/Download/TestBugDebugArray.xls

OK, no smart-alec comments: I know that this function merely
duplicates what can be achieved through the following formula using
exclusively excel's built in functions:

{=AVERAGE(SMALL(AJ27:AJ2027,AG28:AG128))} (as an ArrayFormula)

My purpose is only to test the functionality of user defined functions
like this, and the example provided here shows some limitations, or at
least the need for work-arounds in some instances.

Does anybody know what is causing this behavior, and how to mitigate
its effects? Is it really necessary/advisable to create arrays for
all unser-defined function arguments, or can something else be done to
help avoid this bizarre behavior.

Thanks, Nick
 
C

Charles Williams

That's a wonderful example of the way Excel's recalculation process works
(this behaviour is by design).

see http://www.decisionModels.com/Calcsecretsj.htm for an explanation and
examples of how to make this UDF more efficient,
and associated pages for an explanation of Excel's calculation process.

The function below runs in about 0.6 of a second as compared to over 300 on
your TestBugN example
(note you also have to change your sort routine to handle a 2-d variant
array).

Function AverageUnderTailof(Distribution As Range, atEnd As Long, Number As
Long) As Double
Dim i As Long, StartofTail As Long, EnterTime As Double

Dim vArr As Variant

EnterTime = Timer
If funcount = 0 Then StartTime = EnterTime
vArr = Distribution.Value2
If AreAnyEmpty(vArr) Then Exit Function
StartofTail = 1 + atEnd * (UBound(vArr) - Number)
AverageUnderTailof = 0
With SortOrderof(vArr)
For i = StartofTail To StartofTail + Number - 1
AverageUnderTailof = AverageUnderTailof + vArr(.Values(i), 1)
Next
End With
AverageUnderTailof = AverageUnderTailof / Number
funcount = funcount + 1
Debug.Print Format(funcount, "0"), Format(Timer - EnterTime, "0.0000"),
_
Format(Timer - StartTime, "0.0000"),
Format(AverageUnderTailof, "#,##0.00")
End Function

Function AreAnyEmpty(vArr As Variant) As Boolean
Dim j As Long

AreAnyEmpty = False
If VarType(vArr) >= 8192 Then
For j = 1 To UBound(vArr)
If IsEmpty(vArr(j, 1)) Then
AreAnyEmpty = True
Exit For
End If
Next j
Else
If IsEmpty(vArr) Then AreAnyEmpty = True
End If
End Function

regards
Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com
 
N

Nicholas Dreyer

Charles:

First off thanks for posting this extraordinarily thorough solution.

Now I must exclaim, what a remarkable coincidence it is to see this
great response of yours today!

Earlier today I found your extremely informative web site which
already has explained enough for me to see (sort of) what was going
on, at least enough to be able to come up with the solution to just
filter all cells of incoming ranges for this condition displayed on
your site:

ISEMPTY(Cell.Value) AND Len(Cell.formula)>0.

Your, no doubt better, solution below does raise a few questions for
which answers, if you are in a position, and so inclined to provide
them would help me greatly understand a few long-standing VBA puzzles.


First, the most significant is invoked by your use of the statement

vArr = Distribution.Value2

Given that you must be much more versed in the efficiencies of VBA
coding than I, I must assume that using vArr - with it's awkward extra
dimension - has advantages over directly accessing a range argument.
If so, I am very curious what they would be.

The way the assignment above is done in fact reminds me of a strange
asymmetry I have never been able to comprehend: It is possible to,
with extreme efficiency, assign very large two-dimensional VBA arrays
to ranges by a simple assignment

Range object = VBA ArrayVariable

The above looks like it might be the way I have been looking for
achieving the reverse. Does going through vArr improve the relatively
sluggish VBA response time seen when accessing range values by
referencing each individual cell? This would be quite a revelation to
me if you can say this is true, as I have never been able to figure
out how to get similar speeds loading massive range values into VBA
variables as you do loading large arrays back into excel sheets.

Second, I do not understand the difference between value2 and value.
In particular, is this distinction crucial in the context of your
solution?

Thanks a-million Charles. I will be studying your site for some time
to come, and look forward to any further insights you might be able to
post here regarding my follow-up queries.

Nick
 
N

Nicholas Dreyer

Oh, and I forgot this simple question:

Is there a reason you do not use what seems like the more
comprehensive check

ISEMPTY(Cell.Value) AND Len(Cell.formula)>0

in your "AreAnyEmpty" function?

THanks, Nick
 
N

Nicholas Dreyer

Charles:

Just comformed the efficiencies I suspected you can with your methods.
I too see less than a second using your "AreAnyEmpty" function.
compared to original troublesome 300 without.

Your vArr assignment must be doing something important too, since it
takes 16 seonds using this simple filtering loop at the top of my
"AverageUnderTailof" function

For i = 1 To Distribution.Cells.Count
If IsEmpty(Distribution(i).Value) And _
Len(Distribution(i).Formula) > 0 Then Exit Function
Next

Nick
 
N

Nicholas Dreyer

Charles:

I don't quite get the "by design phrase" you use:

Why does Excel even bother to start calculating any cell formula
expression (such as the one containing my, or any user defined
function) until it has finished calculating all precedents?

Or is such an approach incompatible with the general calculation
optimizing scheme used by excel? It certainly seem unintuitive, but
them very little in optimization solutions ever is . . .

Anyway any thought, however brief on this would be greatly
appreciated, as it sure appeared to me (and I would imagine many
others) to more of a design "flaw" than "feature".

If the answer is on your site, I have not yet found it . . .

Thanks again, Nick
 
C

Charles Williams

Hi Nick,

To answer your questions:

There is a significant overhead associated with initiating transferring
stuff between Excel and VBA.
I always think of VBA and Excel being separated by a very large hill.

See Variant Benchmark on my downloads page to measure the relative size of
the overhead on your system, and also
http://www.decisionmodels.com/VersionCompare.htm

So the larger the amount you transfer in each read or write (use the largest
truck possible for the transport over the hill) the more efficient it is.

There is also a small tunnel through the hill through which VBA can reach
and manipulate things on the Excel side (tunnel=the object model). For
instance if you want to use a worksheet function in your VBA (MATCH for
example) its more efficient for VBA to reach through the tunnel and tell
Excel what to do on the Excel side of the hill and only bring back the
result of the MATCH, rather than transporting all the data over the hill.

Using .VALUE2 is more efficient than using .VALUE (and usually safer as
well) because it does not do the additional implicit conversion for currency
(which may lose precision) and date formatted data.
Unfortunately the default is .VALUE

Why does Excel evaluate Cells containing references to uncalculated cells?
I have not done an analysis, but I suspect that in general its more
efficient when recalculating because the calculation chain will be well
ordered so its not worth doing the extra work of checking.
And there are probably cases where it needs to do this anyway (for instance
a formula containing multiple components some of which reference calculated
cells and some of which reference uncalculated cells, and maybe finding
circular references).


Charles
___________________________________
London Excel Users Conference April 1-2: book now not many places left!
The Excel Calculation Site
http://www.decisionmodels.com
 
N

Nicholas Dreyer

Charles:

Well thanks once again. You have now not only helped me completely
solve my original dilemma, but on the side revealed some wonderful
additional solutions and insights, some of which I had long been
seeking.

I was already operating under the large truck analogy, but did not
know what truck to use for one direction of large array transfers:
Predimensioned arrays can be sent to excel ranges in one assignment
statement, but the other direction just does not work. Until I saw
your use of an undimensioned Variant for that purpose, I could only
loop through all Range Cells to get their values into a VBA Array, i.e
using the smallest truck possible.

Sorry I won't be in London for your conference, but your help for me
here in Seattle will be long remembered and passed on wherever I can.

Nick
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top