Increasing the speed of Sumproduct

H

Hari

Hi,

1. Just read http://www.mcgimpsey.com/excel/formulae/doubleneg.html

In this JEM says that we have double negs so that
=SUMPRODUCT(--(A1:A5>10),B1:B5))
can be coerced in to 1. As per JEM"s explanation single unary will coerce
True/False to Zero/One and the second double unary is used so that the
negative values could be converted to its original sign. My "reasoning" was
instead of using double negative sign why not use a single + sign and
achieve further speed increase.

So I did this =SUMPRODUCT(--(A1:A5>5),--(A1:A5<10),B1:B5) . The data I used
was
A B
2 3
3 4
11 5
5 6
6 10


SADLY, got the answer as Zero. I used evaluate formula feature to see why
excel wasnt coercing true, false to 1,0 by seeing the "+" sign.
But couldnt understsand it. Why cant a single plus sign achieve that?

2.(I did not accept defeat with the above and now tried a diff. approach.).
Suppose the number of conditions to evaluate is 2. Like above we are
checking for 2 conditions A1:A5>5 and A1:A5<10 and in all using 4 negs. I
think it can be reduced to 2 negs in the following way..

=SUMPRODUCT(-(A1:A5>5),-(A1:A5<10),B1:B5). This way when we have EVEN number
of conditions then use the unary operator only once for each condition so
that net effect of minus cancels out.

What if we have ODD number of conditions.? For that, for one condition we
will keep it as double negs and for the rest it will be single negs.

I think that there could be slightly more tweaking of even ODD condition.
Just lie EVEN case for all conditions take it as single neg and then in
front of sumproduct put a Negative sign in order to get the correct result
for the final evaluation.

Like suppose we have to check for 3 conditions, with A1:A5>5, A1:A5<10 and
A1:A5> average(A1:A5)

then write the formula as
= -SUMPRODUCT(-(A1:A5>5),-(A1:A5<10),-(A1:A5> Average(A1:A5)),B1:B5).

Am I talking sense here?

Regards,
Hari
India
 
K

kfotedar

Yaar, Hari! Thanks a lot for your post...am a great fan of th
sumproduct function and have been using it very often in my financia
and budgeting models.

That link explaining the "--" operator was excellent. I've always go
frustrated with Sumproduct's inability to handle text..but the doubl
unitary operator takes care of that, only thing is we have to use ",
instead of "*".

Happy Independence Day! :cool: SAARE JAHAN SE ACCHA HINDUSTAN HAMARA!

Jai Hind!

Kavi
 
F

Frank Kabel

Hi Hari
to answer some of your questions:
1. a single '+' is not a mathematical operation. So Excel just 'skips'
the '+' in front of the boolean values and no coercion will take plase.
So you have to use either
--(...),
0+(...)
(...)^1
to create a real mathematical operation

2. Though you're correct with replacing the double minus with a single
minus in case of an even number of conditions (and using a single
double minus in case of an odd number of conditions) I would NOT do
thisfor the following reasons:
- I doubt you'll recognize any speed advantage in real life
- This is not very robust if you miss a single minus or just got the
number of conditions wrong.
So in total: as you won't gain (IMHO) a significant speed gain just use
the double minus (unaray) operator :)
 
E

ElsiePOA

Hi Hari -

I'm not a sumproduct expert and can't give you advice regarding singl
as opposed to double unary.

However, just out of curiosity, I copied your original formula and dat
to a worksheet and my answer was 10. Are you sure you entered i
correctly
 
H

Hari

Hi Frank,

Thanx for your explanation regarding point 1.

Actually I have a file in which there are lots and lots of Sumproduct's. As
of today close to 50000 and it is made up of multiple condition checking
based on data from another worksheet. The no of sumproducts formulas
increase by 250 every day (as one of the conditions is based on date so with
each new day the range of formula copying also increases).

Now the above file is for one site and one project of our company. I have
similar format of files for every project in every site which amounts to 20
more.

A single file takes around 25 minutes to calculate in a P4 1.6 GHz, 256 MB
RAM, Win 2000. Size of a single file is 7.5 MB as of today

Presently I set the calculation to manual before opening this file. Problem
happens when I mail/share this file with others.Because unsuspectingly they
open the file and excel hangs as their calculation would be in automatic. I
understand that even if i change the formula there may not be much of an
advantage in speed gains (inspite of the large no of sumproducts). Frankly
speaking I havent done any speed test on what happens If I change the
formula. ( due to some issues..)

As I said that Im having problem in sharing this file with other users of
the output from this file. I was thinking what if i write the formulas in VB
code and then write the end results of those formula evaluation in to the
cells. That way only when the Macro is run will the calculation happen. This
way anybody who opens the file will not be caught unawares.

But I see one problem with this approach of using VB code and that is
"auditing" of formula will become "difficult". I mean that there are quite a
number of people who will be comfortable in seeing the formula before their
eyes and see whether the ranges and conditions are correct. But if I use the
VB code, then people will not have much faith in the output (also majority
arent comfortable with VB etc).

So, I wanted to know whether u could offer some perspective/ideas on this.

Regards,
Hari
India
 
F

Frank Kabel

Hi
having so many calculations in one workbook would lead me to thinking
that maybe Excel is not the right tool for this. As I don't know what
you're trying to do with this it is difficult to say though.

But IMHO shared Excel files are nearly always not a good idea :)
 
H

Hari

Hi Frank,

I think ur right about Excel not being the right tool. May be Access, only
problem is I dont know access and think it would take too much time for me
to catch up...

Anyway thanx for ur help.

(Actually I havent set the file in to shared mode. When im through with
dumping the data and "running" the sumproducts and getting the output I mail
it to those who need to use the output/results)

Regards,
Hari
India
 
H

Hari

Hi Elsie,

I meant to write it as =SUMPRODUCT(+(A1:A5>5),+(A1:A5<10),B1:B5) but slip of
my finger ended up writing it as
=SUMPRODUCT(--(A1:A5>5),--(A1:A5<10),B1:B5).

Regards,
Hari
India
 
J

JE McGimpsey

I would think that your spreadsheet could be made significantly more
efficient, though not necessarily efficient enough to matter.

Techniques such as intermediate calculations (e.g., if multiple
SUMPRODUCTs depend on the same conditions, precalculating the conditions
may save recalcs), dynamic ranges where appropriate, etc. can
significantly reduce calc time. See

http://www.decisionmodels.com/optspeed.htm
 
H

Hari

Hi Frank,

On second thought ...

I understand that Unary operator was "introduced" so that number of
calculation steps reduce by one for each condition as compared to normal *.

If that reduction in time was considered reasonable enough to warrant a
change then why are u saying that skipping one unary would "I doubt you'll
recognize any speed advantage in real life"

I understand ur point about robustness but not able to appreciate "if you
miss a single minus or just got the number of conditions wrong." I believe
that even if somebody misses a double minus in case of ODD number of
conditions due to wrong enumeration etc. then ideally it shouldnt matter
much as Sumproduct would yield a Negative answer and I believe that if we
are doing a strictly "count" cases kind of job then the count has to be
positive so wouldnt a negative answer be a flag for the person writing the
formula.

Regards,
Hari
India
 
F

Frank Kabel

Hi Hari
this may be true for conditional counts but is not that obvious for
conditional sums (as they could be of course negative). But again: I
think you probably won't see that much difference regarding speed (my
assumption -> but you may try it on your large sheet).

In addition also the 'speed advantage' of the unary operator is not
really recognizable in most real-life spreadsheets (maybe this is also
different for your spreadsheet with >5000 Sumproduct formulas). For me
it is more a personal taste and before I would try to optimise these
kind of formulas I would consider different approaches (like JE
presented in his post).

Regarding your specific spreadsheet: Have you considered using pivot
tables. In some cases they could be a replacement for Sumproduct
formulas
 
H

Hari

Hi Frank,

Thanx for reply.

I have actually tried using JE's approach whereby common conditions are
calculated only once for that row and . But due to the NATURE of DATA and
CONDITION the file size was growing and growing without a limit (20 MB and
above). Charles taught me this method of grouping the conditions together
around 2 months back and I was very enthused initially but had to discard it
due to file size. (Though in the resulting file speed was better I chose to
not change). For me the compromise of speed is better than file size (as the
file has to be mailed etc).

Pivot had somewhat a similar problem of file size.

Now, I think I will turn to VB and "Dump" the result directly in to the
respective cells.

(I have 50000 SP's not 5000..)

Regards,
Hari
India
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top