SUMIF vs SUM(IF(..)) vs SUMPRODUCT

H

Harlan Grove

I'm too lazy at the moment to find the thread in which someone asked about the
relative merits of SUMPRODUCT vs SUM(IF(..)) when there were multiple
conditions. The question deserves a reply.

There's a third way to sum conditionally, SUMIF, but it only accepts criteria
that would be applied singly. (I'm ignoring the fact that it can take an array
second argument of mutually exclusive criteria.) If a single criterion is needed
and the values to be summed are in a single area range (rectangular block of
cells on a single worksheet), then SUMIF is the best way to sum conditionally.

For more complicated conditional sums, usually involving multiple criteria, both
SUM(IF(..)) and SUMPRODUCT can be used in most cases. The following formulas
should give the same results.

=SUM(IF((Range>0)*(MOD(Range,5)=0),Range)

=SUMPRODUCT(--(Range>0)*(MOD(Range,5)=0),Range)

The SUM(IF(..)) formula must be entered as an array formula, but the SUMPRODUCT
formula needn't be. As an educated guess, this means the formula parser is
involved in evaluating SUM(IF(...)), but SUMPRODUCT itself can handle the entire
evaluation. Usually this means SUMPRODUCT will recalculate faster than
equivalent SUM(IF(..)) calls. On the other hand, if any of the criteria involve
error trapping, SUM(IF(..)) is the only sensible option. For example,

=SUM(IF(ISNUMBER(Range),IF(Range>0,Range)))

=SUMPRODUCT(IF(ISNUMBER(Range),--(Range>0),0),Range)

should return the same results if *both* were entered as array formulas. This is
a quirk of how Excel evaluates IF(..) in array contexts. If the IF function
should return an array result, the formula containing must *always* be entered
as an array formula. Entering SUMPRODUCT formulas as array formulas eliminates
its recalc speed, at which point SUM(IF(..)) is better because it involves less
typing.

So, when error trapping isn't an issue, SUMPRODUCT recalcs faster and doesn't
need to be entered as an array. Those are the benefits compared to SUM(IF(..)).
On the other hand, SUM(IF(..)) is more general and is the most practical way to
handle error trapping.
 
B

Bob Phillips

Harlan Grove said:
=SUMPRODUCT(IF(ISNUMBER(Range),--(Range>0),0),Range)

should return the same results if *both* were entered as array formulas. This is
a quirk of how Excel evaluates IF(..) in array contexts. If the IF function
should return an array result, the formula containing must *always* be entered
as an array formula. Entering SUMPRODUCT formulas as array formulas eliminates
its recalc speed, at which point SUM(IF(..)) is better because it involves less
typing.

Why not

=SUMPRODUCT(--(ISNUMBER(Range)),Range)
 
H

Harlan Grove

...
Why not

=SUMPRODUCT(--(ISNUMBER(Range)),Range)

Because the former sums only positive numbers while the latter sums all numbers
perhaps?! Am I the only person in these newsgroups who understands there could
be BOTH positive AND negative numbers in any arbitrary range?
 
B

Bob Phillips

Maybe, but I have also seen (at least) one post from you where the OP asked
for the min value in a range and you gave a formula for excluding <=0, so
maybe even you forget.

And in this case, then how about

=SUMPRODUCT(--(ISNUMBER(Range)),--(Range>0),Range)

--

HTH

Bob Phillips
... looking out across Poole Harbour to the Purbecks
(remove nothere from the email address if mailing direct)
 
H

Harlan Grove

...
...
And in this case, then how about

=SUMPRODUCT(--(ISNUMBER(Range)),--(Range>0),Range)
...

Because the IF(ISNUMBER(Range),...) was there to catch ERROR values. Run your
formula and both

=SUMPRODUCT(IF(ISNUMBER(Range),--(Range>0),0),Range)

=SUM(IF(ISNUMBER(Range),IF(Range>0,Range)))

against any range you want as long as it contains some error values and some
nonnumeric text. Actually, only the SUM(IF(..)) formula will return a number. My
SUMPRODUCT formula should have been either

=SUMPRODUCT(IF(ISNUMBER(Range),--(Range>0)),IF(ISNUMBER(Range),Range))

or

=SUMPRODUCT(IF(ISNUMBER(Range),IF(Range>0,Range)))

which makes it clearer that SUMPRODUCT ain't the thing to use to trap errors.
 
C

Charles Williams

To add to Harlan's comments:

A few weeks ago I finally did some speed tests on SUMPRODUCT vs array
formulae and on --.
My results were

there is only a few percent speed advantage but SUMPRODUCT with -- was the
winner.

If you do not need the -- and are using the native SUMPRODUCT function of
multiply and add using commas for things like weighted average calculations
then SUMPRODUCT was 20-25% faster than an equivalent array formula

Charles
______________________
Decision Models
The Excel Calculation Site.
www.DecisionModels.com
 
B

Bob Phillips

Well your definition of trap an error differs from mine. Trap an error to me
means highlight it, find it, not ignore it, which is what you SUMIF formula
does. My view is that if there is an error on the worksheet it should be
removed, either by correcting the formula, or if accepted that it a #N/A say
can arise, code that into the source formula. Ignoring them could hide all
manner of problems, leading to incorrect results.

I would have thought you, with your strong views on who should be allowed to
maintain spreadsheets and who shouldn't, would not have advocated such a
position.

Bob
 
H

Harlan Grove

Well your definition of trap an error differs from mine. Trap an error to me
means highlight it, find it, not ignore it, which is what you SUMIF formula
does. My view is that if there is an error on the worksheet it should be
removed, either by correcting the formula, or if accepted that it a #N/A say
can arise, code that into the source formula. Ignoring them could hide all
manner of problems, leading to incorrect results.

I would have thought you, with your strong views on who should be allowed to
maintain spreadsheets and who shouldn't, would not have advocated such a
position.
...

In general, finding #NAME?, #NULL!, #NUM! (except as [X|]IRR return values) and
#REF! errors is mandatory, and such errors should propagate through all other
formulas whenever possible. On the other hand, #VALUE!, #DIV/0! and #N/A are
'common' errors, especially #N/A, which is sometimes *necessary* in chart
ranges. Generally it's a good idea to have separate chart and calculation ranges
when chart ranges need to hold #N/A values, but sometimes there are old
fashioned memory usage considerations that argue against doing so.

As long as charts need #N/A rather than "" to simulate blank cells, FIND and
SEARCH return #VALUE! for no matches rather than 0 or -1, MATCH/[V|H|]LOOKUP
return #N/A for no matches rather than 0 or -1 (who thought up #VALUE! rather
than #N/A for FIND and SEARCH?), Excel can return *both* #VALUE! and 5 for
="abc"+5 depending on the Transition Formula Evaluation setting, ERROR.TYPE
returns #N/A rather than 0 or -1 when its argument *isn't* an error value, and
there's the @#$%&*! 7 nested function call limit, there will be occasions on
which it's just not expedient to fix errors where they arise. For those
situations, trapping errors with an IF call may be the best solution.

So, letting error values propagate when they SHOULD NOT OCCUR is sensible.
Trapping errors when they're not unlikely or even expected, as in *MANY* FIND,
SEARCH, MATCH, [V|H|]LOOKUP and [X|]IRR calls, is nothing exceptional, just a
part of ordinary spreadsheet development.

But you're right, I was lazy. If only #N/A should be trapped,

=SUM(IF(1-ISNA(Range),IF(Range>0,Range)))

If #DIV/0!, #VALUE! and #N/A should all be trapped but other errors propagate,

=SUM(IF(ISERROR(1/((ERROR.TYPE(Range)=2)+(ERROR.TYPE(Range)=3)
+(ERROR.TYPE(Range)=7))),IF(Range>0,Range)))
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top