Top Ten rules for using Sumproduct ?

  • Thread starter Thread starter JE McGimpsey
  • Start date Start date
J

JE McGimpsey

twaccess said:
Rule 1 All ranges within sumproduct function must be the same size.

change "ranges" to "arrays"

=SUMPRODUCT(A1:B2,{1,2;3,4})

works just fine even though it has only one range.
 
I continue to be fascinated by this function and have printed off copie
of Ken and Aladin's threads on this subject which make for interestin
reading.

However I feel a quick top ten list of simple rules would help me t
avoid making elementary mistakes etc, and maybe of help to others too.

Rule 1 All ranges within sumproduct function must be the same size.
Rule 2 Ranges cannot be whole columns, i.e. $A:$A, they must b
specific, i.e. $A2:$A200.
Rule 3 Data cannot be read in from ranges outside of the curren
spreadsheet. (This is one I made up and may be incorrect !)

What other 'simple' rules do others have to add to this ?

I would exclude syntax items here because they are well covered b
others like Ken Wright.

Terr
 
Hi Terry

Newsbeitrag [snip]
Rule 1 All ranges within sumproduct function must be the same size.
Rule 2 Ranges cannot be whole columns, i.e. $A:$A, they must be
specific, i.e. $A2:$A200.
Rule 3 Data cannot be read in from ranges outside of the current
spreadsheet. (This is one I made up and may be incorrect !)

You can use ranges outside the current sheet. e.g
=SUMPRODUCT(--('sheet2'$A$1:$A$100=2),'sheet2'!$B$1:$B$100)

What other 'simple' rules do others have to add to this ?
4. You have to coerce the boolean values to numbers with a mathematical
operation
5. '--' is more efficient (accroding to some discussions in this NG)
than multiplying the values


Frank
 
I'm not sure it really needs a top ten list.

SUMPRODUCT(arr1, arr2) is simply a constrained version of the
array-entered SUM(arr1*arr2).

It has only 3 advantages:

1) It doesn't have to be array-entered
2) It's significantly faster than an array-entered SUM()
3) It treats non-numeric entries as zero.

It has 1 disadvantage:
1) arrays must have the same dimension

It shares a limitation with SUM(arr1*arr2)
1) Array formulas cannot use entire columns or rows
 
Frank Kabel said:
5. '--' is more efficient (accroding to some discussions in this NG)
than multiplying the values

Whether more efficient or not (it had a small speed advantage in some
test I ran, but the tests weren't controlled), it does(a) use the
tightest binding of operator to operand, and (b) preserve SUMPRODUCT's
ability to treat non-numeric values as zero.
 
Another 'disadvantage'

If comparing an array against another array, the array compared against must
be horizontal. If it is vertical it has to be TRANSPOSed, which takes the
away the non-array advantage.

--

HTH

Bob Phillips
... looking out across Poole Harbour to the Purbecks
(remove nothere from the email address if mailing direct)
 
1) Array formulas cannot use entire columns or rows

Make that just columns :-)

=SUMPRODUCT(1:1,2:2) works fine

as does

{=SUM(1:1*2:2)}
 
twaccess > said:
I continue to be fascinated by this function and have printed off copies
of Ken and Aladin's threads on this subject which make for interesting
reading.

However I feel a quick top ten list of simple rules would help me to
avoid making elementary mistakes etc, and maybe of help to others too.

Rule 1 All ranges within sumproduct function must be the same size.

As stated (admittedly, often posited as such), constitutes a cognitive
obstacle...

If you are inclined to use the 'native' syntax of SumProduct, the following
would not work:

=SUMPRODUCT(A1:A4,B2:C4)

although the ranges involved are equally sized in vertical dimension. That
is, SizeOf(A1:A4)=SizeOf(B2:B4).

In such cases, one is advised to drop the 'native' comma syntax and invoke
instead the star syntax:

=SUMPRODUCT(A1:A4*B2:C4)

This formula will produce a #VALUE! error, when there is text (including a
formula-blank, i.e., "") in the ranges of interest, otherwise it's OK.

The rule can be amended to:

If the refs are of the same shape (equally sized in both dimensions), invoke
SumProduct using the comma-syntax.
If the refs are equally sized in one dimension, invoke SumProduct using the
star syntax.

Note also that a ref (reference argument) can be an array object (like
{1,3,5,7}) or a range object (like F2:F6).
Rule 2 Ranges cannot be whole columns, i.e. $A:$A, they must be
specific, i.e. $A2:$A200.

See http://tinyurl.com/3h724 as to the why of this rule.
Rule 3 Data cannot be read in from ranges outside of the current
spreadsheet. (This is one I made up and may be incorrect !)

This is indeed incorrect.
What other 'simple' rules do others have to add to this ?

I would exclude syntax items here because they are well covered by
others like Ken Wright.
[...]

You can't exclude the syntax issues if efficieny also matters, in particular
when you want a set of rules of thumb (or heuristic rules). A case in point
is how to OR...

Example:

Let A1:C10 house:
{"Jon",4,1;"Jon",7,1;"Damon",4,2;"Mete",5,1;"Carla",6,1;"Fra",7,2;"Xsa",6,2;
"Tra",8,1;0,6,2;"Qwa",3,1},
where "" stands for an empty cell.

=SUMPRODUCT((A1:A10={"Jon","Damon","Carla","Mete"})*(C1:C10=1)*B1:B10)

or equivalent

=SUMPRODUCT(--ISNUMBER(MATCH(A1:A10,{"Jon","Damon","Carla","Mete"},0)),--(C1
:C10=1),B1:B10)

where L1:O1 houses {"Jon","Damon","Carla","Mete"},

forces one to invoke the star syntax and is an expensive formula because of:

{TRUE,FALSE,FALSE,FALSE;
TRUE,FALSE,FALSE,FALSE;
FALSE,TRUE,FALSE,FALSE;
FALSE,FALSE,FALSE,TRUE;
FALSE,FALSE,TRUE,FALSE;
FALSE,FALSE,FALSE,FALSE;
FALSE,FALSE,FALSE,FALSE;
FALSE,FALSE,FALSE,FALSE;
FALSE,FALSE,FALSE,FALSE;
FALSE,FALSE,FALSE,FALSE}*
{TRUE;TRUE;FALSE;TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;TRUE}*
{4;7;4;5;6;7;6;8;6;3}

while

=SUMPRODUCT(--ISNUMBER(MATCH(A1:A10,{"Jon","Damon","Carla","Mete"},0)),B1:B1
0)

involves a lesser amount of computations:

{1;1;1;1;1;0;0;0;0;0}*{1;1;0;1;1;0;0;1;0;1}*{4;7;4;5;6;7;6;8;6;3}

Given the foregoing, a pragmatic rule to add to your list would be:

If the number of criteria that is involved in a conditional > 2, then use
ISNUMBER/MATCH as in --ISNUMBER(MATCH(A1:A10,{"Jon","Damon","Carla","Mete"}.

Otherwise:

If one criteria, use = as in --(A1:A10="Mete"); if two criteria, use + as in
(A1:A10="Mete)+(A1:A10="Jon").

One thing to note though: Formulas with SumProduct have lots in common with
formulas that must be committed with control+shift+enter (often referred to
as array formulas).
 
JE McGimpsey said:
Whether more efficient or not (it had a small speed advantage in some
test I ran, but the tests weren't controlled), it does(a) use the
tightest binding of operator to operand, and (b) preserve SUMPRODUCT's
ability to treat non-numeric values as zero.

Fully agree. The pick order is: -- faster than +0 faster than *1,
although -- and +0 are very close.
Becasue of tight binding -- effects, it is less jumpy in speed profile than
+0 from run to run.
 
Methinks I'm just going to set up a new mail rule in OE

'Where the From Line contains people' = 'Aladin'

'Copy message' to folder 'Good stuff to keep'

All this dragging and dropping is wearing my little old arm out <vbg>
 
Hi

Make it a "top 1 list"

1) When trying to figure out the endless possibilities and combinations of
a SUMPRODUCT function - post on the microsoft.public.excel.misc newsgroup
and hope Aladin Akyurek, Bob Phillips, Frank Kabel, JE McGimpsey, Ken
Wright, Norman Harker or any of the other regular contributors or MVPs
respond :)

Cheers
JulieD
 
LOL - OK I confess, I got pulled up on exactly the same thing the other day in
one of the tek forums - for some reason it just seemed natural to write rows as
well as columns, so it kinda stood out for me <vbg>
 
A big thanks to everyone who replied today. I'll print this all off no
and add it to my bedtime reading list.

I understand that syntax is important for efficiency, but as a relativ
lay person, trying to get to grips with such a powerful function,
really wanted to try and get into the thinking behind how SUMPRODUC
works. In other words, try and walk before I can run.

Julie makes a very good point, just post my query on the newsgroup an
wait for an answer to come back from hopefully one of the experts
However, I do feel that this is one function that I want to get to kno
at least as well as my other favourites such as SUMIF & VLOOKUP to nam
just a couple. So, I think I'll persevere with this.

Frank, may I ask what is meant by your point.
- "You have to coerce the boolean values to numbers with a mathematica
operation"-

Thanks again to all.


Terr
 
Newsbeitrag [snip]
Frank, may I ask what is meant by your point.
- "You have to coerce the boolean values to numbers with a mathematical
operation"-

Hi Terry
lets give an example:
=SUMPRODUCT(--(range1="cond_1"),range2)

within the SUMPRODUCT function you have the comparison: range1 =
"Cond_1"

This alone would return either TRUE or FALSE. Toe make SUMPRODUCT work
with these results you have to convert these boolean results to a
number (TRUE = 1 and FALSE = 0)

To do this there're several mathematical operations. That is by using a
boolean value in a mathematical operation Excel automatically coerces
this boolean value to a number.
Quite common is the double minus (--) as in the formula above. You
could also use
=SUMPRODUCT(1*(range1="cond_1"),range2)
or
=SUMPRODUCT(0+(range1="cond_1"),range2)

but the double minus is the fastest way

Frank
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top