How to search for nearest nonblank cell in previous rows?

J

joeu2004

What expression can I use to search previous rows for the nearest
nonblank cell in a specified column (or the current column), then
return a reference to a cell in another column in that row?

Whew, that was a mouthful! See the example below.

I know that I can accomplish what I need by using helper columns. But
I wonder if I can make do without them.

Forgive me if this question has been asked and answered a million
times. I think it has. But I am unable to find the thread at the
moment.

Here is a specific example. Forgive me if the columns (row number and
columns A-D) do not align well.

1: 12/31/2006 101,000 1000 12.00%
2: 1/12/2007 99,000
3: 1/20/2007 110,000
4: 1/31/2007 110,100 (formula) 11.50%
5: 2/3/2007 111,000
6: 2/15/2007 110,000
7: 2/17/2007 110,100
8: 2/28/2007 111,200 (formula) 13.00%

If I were writing the formulas individually, they would be:

C4: =roundup(sumproduct(A2:A4-A1:A3,B1:B3)*D4/365,-2) [= 1100]
C8: =roundup(sumproduct(A5:A8-A4:A7,B4:B7)*D8/365,-2) [= 1200]

I would like a single formula, which I copy into C4, C8 etc. It would
synthesize A2, A1 and B1 in C4, for example, based on the fact that D1
(or C1) is nonblank. (Alternatively, the formula could synthesize the
ranges A2:A4, A1:A3 and B1:B3.) Perhaps the solution would be
structured something like this:

C4: =roundup(sumproduct(offset("search from D3; finds D1",1,-3):A4-
offset("search from D3; finds D1",0,-3):A3,offset("search from D3;
finds D1",0,-2):B3)*D4/365,-2)

C8: =roundup(sumproduct(offset("search from D7; finds D4",1,-3):A8-
offset("search from D7; finds D4",0,-3):A7,offset("search from D7;
finds D4",0,-2):B7)*D8/365,-2)

Despite the obvious inefficiency, I would prefer a solution that
requires only Excel worksheet functions (including ATP functions, if
necessary) in an Excel formula.

(Ah, but what a mess!)

Alternatively, I would consider a VBA solution. I imagine that could
even be much more efficient if the SUMPRODUCT can be done within the
VBA function, since only one search would be required, I assume. But
I don't know how to do that.

Note: The solution can assume that there will be a nonblank cell in
column D (or C) in the first row of the table. Thus, no error-
checking is required.

TIA.
 
J

joeu2004

What expression can I use to search previous rows for the nearest
nonblank cell in a specified column (or the current column), then
return a reference to a cell in another column in that row?

Forgot to mention: I am using Office Excel 2003.
 
L

Lori

Maybe this in C4, and copy down to other rows...

=ROUNDUP(SUMPRODUCT(--(MATCH(9E+99,C$1:C3)<=ROW(C$1:C3)),A$2:A4-A$1:A3,B$1:B3)*D4/365,-2)
 
J

joeu2004

Maybe this in C4, and copy down to other rows...

Well said:
=ROUNDUP(SUMPRODUCT(--(MATCH(9E+99,C$1:C3)
<=ROW(C$1:C3)),A$2:A4-A$1:A3,B$1:­B3)*D4/365,-2)

Thanks. That does seem to work. But I do not understand it would. Can you
(or someone) explain the theory of operation?

I only need an explanation of the MATCH() usage. I understand that if
MATCH() finds row x, then --(x<=row(C$1:C3)) expands into an array of false
(0) for rows below x and true (1) for rows at and above x; and that array of
0s and 1s effectively cancels out uninteresting products of
(A$2:A4-A$1:A3)*(B$1:B3).

But according to the help page, in that form, MATCH() will search from C$1
to the row above (C3, in this case) for the __largest__ number less than or
equal to 9E+99. In this form, MATCH() expects the lookup-array to be in
ascending order.

That is true for the column C in my example; but that is only by
coincidence. Moreover, I prefer to search column D, where the figures
clearly are not in ascending order. Finally, I am not interested in the
largest number, but in the nearest number above.

That said, I am surprised to find that using a lookup-array of random
percentages (with a random number of interstitial blanks, at least 1), the
MATCH() usage as you wrote it (substituting column D for column C) does work
exactly as I need it. That is, MATCH() always returns the nearest number
above, not the largest number in the array.

Why is that? Is it only an accident of implementation?

(I would expect that MATCH() uses a binary search algorithm, which normally
starts with the middle element. But unpredictable things happen in a binary
search if the lookup-array is not sorted. Nonetheless, I would be surprised
that it always determines that the last non-zero element of the array meets
the criteria.)

Thanks for any insight.

That said, I do like your structure of the SUMPRODUCT(), performing the
look-up only once. So if MATCH() does not really work in the final analysis,
I believe I am looking for a look-up function that simply searches backwards
sequentially -- or some other clever usage that has that effect.

Thanks again.
 
J

joeu2004

Errata....

I would expect that MATCH() uses a binary search algorithm

No reason for me to make that assumption, at in this particular case.
I would hope that MATCH() uses a binary search (or better) for large
ranges. But for small ranges, a linear search would do just fine. I
did not pay any attention to the relative values of the __two__
nearest nonblank cells. If MATCH() happens to search backwards
linearly (at least for small ranges) and if the nearest nonblank just
happened to be larger than the 2nd nearest nonblank, Lori's MATCH()
would appear to work, but only by coincidence.

However, Lori's MATCH() always seems to work (i.e. it finds the
nearest nonblank value) even if the 2nd nearest nonblank cell is
larger than the nearest.

I am still curious why it works.

However, Lori's structure of the SUMPRODUCT() does make it very clear
how I would design my own VBA function to find the nearest nonblank
cell and return the row number. Thanks again for that.

But I am still interested in a non-VBA solution. If Lori's MATCH()
solution is it, I would still appreciate an explanation, in light of
the help page description (that is, my understanding of it).

Thanks again, Lori, for a very clean approach, if not the solution.


, which normally
 
J

joeu2004

If MATCH() happens to search backwards
linearly (at least for small ranges) and if the nearest nonblank just
happened to be larger than the 2nd nearest nonblank, Lori's MATCH()
would appear to work, but only by coincidence.

Oh, that's probably it exactly. If MATCH() does a linear backward
search (at least for small ranges), it only has to look at the nearest
nonblank cell because that will be the first value less than 9E+99,
which is presumed to be largest value less than 9E+99 because the
range is presumed to be in ascending order for a type 1 search.

But can we really depend on MATCH() to do a linear search for all
ranges? Where is that documented? I don't see it on the help page.
Did I miss it?

(Sorry for the incessant postings.)
 
J

joeu2004

Oh, that's probably it exactly. If MATCH() does a linear backward
search (at least for small ranges), it only has to look at the nearest
nonblank cell because that will be the first value less than 9E+99,
which is presumed to be largest value less than 9E+99 because the
range is presumed to be in ascending order for a type 1 search.

But can we really depend on MATCH() to do a linear search for all
ranges? Where is that documented?


MATCH can always be used this way to find the last value, there
are many references in newsgroups, also see Bob Philip's article:
http://www.xldynamic.com/source/xld.LastValue.html

Thanks, Lori. But the fact that many people, even experts, rely on
undocumented behavior -- worse, behavior that seems contrary to MS
documentation -- does not make it right, in my book.

For example, MATCH(...,-1) cannot be used to find the first value in
an unsorted array. It returns an error if the array is not sorted
properly, as documented. It seems odd to me that MATCH(...,1) does
not return a similar error. (But it is true that the MS help page
does not say it should.)

The fact that this usage of MATCH(...,1) is so widely mentioned in
expert documentation might discourage MS from "improving" MATCH() in
such a way that this usage fails in the future. However, I would feel
more comfortable with a dispositive statement that MS supports this
dependency on the current implementation of MATCH(...,1).

PS: As a system designer, I was also concerned by people's reliance
on undocumented behavior. It tied my hands when I wanted to improve
things. I can easily foresee some hapless MS Excel developer deciding
to improve MATCH() performance on large ranges by changing the
(apparent) linear search to a binary search -- or by fixing the
"inconsistency" that MATCH(...,1) does not return an error when
MATCH(...,-1) does. That would break the use of MATCH(...,1) to find
the last value in an unsorted range.
 
J

joeu2004

PPS....

PS:  As a system designer, I was [always] concerned by people's reliance
on undocumented behavior.

But as a user, I will probably use your suggetion <g>. It avoids
using a VBA function. And it is comforting to know that experts have
documented this as a way to do it.

Thanks again.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top