sumproduct range question



I am using SUMPRODUCT to do a table lookup/sum for me. For example I
have data below:

1 2
A color price
B red 500
C red 1000
D blue 750

If a use SUMPRODUCT((A2:A4=A2)*(B2:B4)) I get 1500 or
SUMPRODUCT((A2:A4=A4)*(B2:B4)) and I get 750 and I can use this but I
am having trouble understanding what the range argument represents. I
have tried just putting it in a cell (eg. =((A2:A4=A2)*(B2:B4)) ) but I
just get #VALUE! and if I put in just SUMPRODUCT((A2:A4=A4)) I get 0
while I would expect to get 1 or TRUE or perhaps the row number.

This seems like a powerful technique to use that I imagine has other
applications but I really do not understand the syntax.

What is going on here?







The formula:


in a single cell will not do much. The other formula,


is almost correct. What it does is it examines every cell in A2:A4
against A4 to see if it is equal or not. When it is equal you get TRUE.
So this formula is like (for your example):


Aggregate functions ignore logical values. However, if you get the
negative of the negative of a logical value, negation coerces T/F into
numbers and -- makes T=1 and F=0. Hence, to count how many times A4
value appears in A2:A4 you need:

=SUMPRODUCT(--(A2:A4=A4)) or

i.e. anything to coerce logical to numeric w/o really changing it. In
your original formula,


you are multiplying, hence you are forcing coercion to numeric. If you
had the two components as separate arguments then you would need to
coerce each one individually. The following formula is computationally
equivalent to yours:


Kostis Vezerides



Bob Phillips

Before you decide on which function to use, you need to determine what it
is that you want to do.

For instance, if you just wish to sum values based upon one single, simple
condition, then use SUMIF. If there are multiple conditions, or the
condition is too complex for SUMIF, then look at SUMPRODUCT, or
array-entered SUM(IF(...

The thing about all such functions is that they basically work on ranges or
arrays. If the range is independent that suggests it is being evaluated for
the values in the range. If the range is tested against another value, that
is to determine whether the condition is met or not. Unfortunately, a met
condition returns TRUE, an unmet condition returns FALSE, which is not of
much value in doing math, so it is necessary to coerce these values. That is
where the * operator comes into play, or the double unary --.


Bob Phillips

(replace somewhere in email address with gmail if mailing direct)

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question