VBA find approximate match

G

Guest

Hi,
I am trying to mimic the approximate find methods used in Excel MATCH,
VLOOKUP, etc. application functions in VBA. I have a large(!) data set and,
in this particular instance, want to perform operations (average, max, min)
on subranges containing data for different periods. The data is arranged in
ascending order and I am using the date and time column of the data to select
my subranges. However, although the data is usually equally spaced, there is
a small tolerances (no doubt due to rounding operations) and some date and
time values will be missing. As I am performing this action thousands of
times in the code, I do not want to resort to Application.VLOOKUP or
Application.Match. I cannot, however, find any adjustments to the VBA FIND
method that will allow me to check for a match within a certain tolerances
(say within five minutes of my target date and time). What I would really
like is to find the cell in the column with and exact or largest value that
is less than my target (and I can then check the tolerances separately).
Although I am sure that this is a common requirement, I have been unable to
find anything to allow me to do this in an efficient manner.
Any help, clues or suggestions would be most gratefully appreciated.
 
N

Niek Otten

Application.Match on a sorted table, approximate search, is very fast. A self developed binary search can be faster, it seems to
depend on several parameters and I've seen cases where it was slower. Anyway, read the whole table into a VBA array and search in
that array; much faster than looking in the worksheet.

--
Kind regards,

Niek Otten
Microsoft MVP - Excel


| Hi,
| I am trying to mimic the approximate find methods used in Excel MATCH,
| VLOOKUP, etc. application functions in VBA. I have a large(!) data set and,
| in this particular instance, want to perform operations (average, max, min)
| on subranges containing data for different periods. The data is arranged in
| ascending order and I am using the date and time column of the data to select
| my subranges. However, although the data is usually equally spaced, there is
| a small tolerances (no doubt due to rounding operations) and some date and
| time values will be missing. As I am performing this action thousands of
| times in the code, I do not want to resort to Application.VLOOKUP or
| Application.Match. I cannot, however, find any adjustments to the VBA FIND
| method that will allow me to check for a match within a certain tolerances
| (say within five minutes of my target date and time). What I would really
| like is to find the cell in the column with and exact or largest value that
| is less than my target (and I can then check the tolerances separately).
| Although I am sure that this is a common requirement, I have been unable to
| find anything to allow me to do this in an efficient manner.
| Any help, clues or suggestions would be most gratefully appreciated.
 
G

Guest

Hi Niek,
Thanks for your very fast reply.
My data set is typically 35000 rows and at least three columns. I am rather
wary of reading that as an array in to VBA though I am well aware that VBA is
much faster at accessing internally stored data. Are you aware of a good way
of reading in even such large data sets? If not, I guess could perhaps do it
in junks...
I will try Application.Match (though I had read that Find is up to 100 times
faster) but am surpirsed that there is no vba built in method for approximate
finds (I have been trying to achieve the desired result by playing around
with the date formatting and converting the result to a string for the
comparison but...).
Thanks gain for your response - and if you can let me know how to read in
large data sets efficiently, that would be really handy.
Best wishes, Boris.
 
B

Bob Phillips

Just load it

Dim ary

ary = Range("A1:A3500")

To use an approximate match, the data does need to be sorted.

--

HTH

Bob Phillips

(replace xxxx in the email address with gmail if mailing direct)
 
G

Guest

Dear Bob,
Well, that's much better then for j = 1 to 35000 ary(j) = etc etc!
I'll try that and see what Excel makes of the availble resources.
Many tanks and best wishes, Boris.
 
N

Niek Otten

< Are you aware of a good way of reading in even such large data sets?>

The way Bob suggested; doing that 100 times took 3 seconds

--
Kind regards,

Niek Otten
Microsoft MVP - Excel

| Hi Niek,
| Thanks for your very fast reply.
| My data set is typically 35000 rows and at least three columns. I am rather
| wary of reading that as an array in to VBA though I am well aware that VBA is
| much faster at accessing internally stored data. Are you aware of a good way
| of reading in even such large data sets? If not, I guess could perhaps do it
| in junks...
| I will try Application.Match (though I had read that Find is up to 100 times
| faster) but am surpirsed that there is no vba built in method for approximate
| finds (I have been trying to achieve the desired result by playing around
| with the date formatting and converting the result to a string for the
| comparison but...).
| Thanks gain for your response - and if you can let me know how to read in
| large data sets efficiently, that would be really handy.
| Best wishes, Boris.
|
| "Niek Otten" wrote:
|
| > Application.Match on a sorted table, approximate search, is very fast. A self developed binary search can be faster, it seems
to
| > depend on several parameters and I've seen cases where it was slower. Anyway, read the whole table into a VBA array and search
in
| > that array; much faster than looking in the worksheet.
| >
| > --
| > Kind regards,
| >
| > Niek Otten
| > Microsoft MVP - Excel
| >
| >
| > | Hi,
| > | I am trying to mimic the approximate find methods used in Excel MATCH,
| > | VLOOKUP, etc. application functions in VBA. I have a large(!) data set and,
| > | in this particular instance, want to perform operations (average, max, min)
| > | on subranges containing data for different periods. The data is arranged in
| > | ascending order and I am using the date and time column of the data to select
| > | my subranges. However, although the data is usually equally spaced, there is
| > | a small tolerances (no doubt due to rounding operations) and some date and
| > | time values will be missing. As I am performing this action thousands of
| > | times in the code, I do not want to resort to Application.VLOOKUP or
| > | Application.Match. I cannot, however, find any adjustments to the VBA FIND
| > | method that will allow me to check for a match within a certain tolerances
| > | (say within five minutes of my target date and time). What I would really
| > | like is to find the cell in the column with and exact or largest value that
| > | is less than my target (and I can then check the tolerances separately).
| > | Although I am sure that this is a common requirement, I have been unable to
| > | find anything to allow me to do this in an efficient manner.
| > | Any help, clues or suggestions would be most gratefully appreciated.
| >
| >
| >
 
G

Guest

Dear Niek, dear Bob,
.... which begs the next question: how do I (efficently) search in my newly
created arrays, preferably using an application.match or vba find type of
function (the examples have all search through ranges). I guess I can pull
out my best guess candidate and then sequentially reduce or increase) the
element position position unit I get to the right one but is there a built in
function for thsi?
Thanks again to both of you for your very helpful comments.
Best wishes, Boris.
 
B

Bob Phillips

MsgBox Application.Match(somevalue, ary, 0)

Do note the caveat that both Niek and I mentioned, the data must be sorted.

Form help

MATCH(lookup_value,lookup_array,match_type)

Lookup_value is the value you use to find the value you want in a table.

a.. Lookup_value is the value you want to match in lookup_array. For
example, when you look up someone's number in a telephone book, you are
using the person's name as the lookup value, but the telephone number is the
value you want.


b.. Lookup_value can be a value (number, text, or logical value) or a cell
reference to a number, text, or logical value.

Lookup_array is a contiguous range of cells containing possible lookup
values. Lookup_array can be an array or an array reference.

Match_type is the number -1, 0, or 1. Match_type specifies how Microsoft
Excel matches lookup_value with values in lookup_array.

a.. If match_type is 1, MATCH finds the largest value that is less than or
equal to lookup_value. Lookup_array must be placed in ascending order:
....-2, -1, 0, 1, 2, ..., A-Z, FALSE, TRUE.


b.. If match_type is 0, MATCH finds the first value that is exactly equal
to lookup_value. Lookup_array can be in any order.


c.. If match_type is -1, MATCH finds the smallest value that is greater
than or equal to lookup_value. Lookup_array must be placed in descending
order: TRUE, FALSE, Z-A,...2, 1, 0, -1, -2,..., and so on.


d.. If match_type is omitted, it is assumed to be 1.

--

HTH

Bob Phillips

(replace xxxx in the email address with gmail if mailing direct)
 
N

Niek Otten

<I guess I can pull out my best guess candidate and then sequentially reduce or increase>

Don't bother. Just use a binary search. I tested the one below on a 35000*3 array. One million searches took 7 seconds.

You'll have to change it slightly to incorporate your tolerance.

Or just use the worksheet functiions, as both Bob and I suggested.

I picked this one from a newgroup, but unfortunately can't remember who wrote it.
I changed it for just numbers and a few other minor things, so do test it!

===============================================================
Option Explicit

Function BSearchNumber(vRng(), vItem As Double)
' --------------------------------------------------
' Like BSearch, but only for numbers
' DESCRIPTION: Perform a binary search on an (assumed)
' sorted list
'
' RETURNS: The record/row number in which
' a match was found or #VALUE if no
' match found
' --------------------------------------------------
Dim FirstRec As Long, LastRec As Long
Dim TestRec As Long, LastTest As Long
Dim VarFound As Double

LastTest = -1
FirstRec = 1
LastRec = UBound(vRng)

Do
TestRec = Int((FirstRec + LastRec + 1) / 2)

If TestRec = LastTest Then Exit Do

LastTest = TestRec

VarFound = vRng(TestRec, 1)

If VarFound = vItem Then
BSearchNumber = TestRec
Exit Function
End If

If VarFound < vItem Then
FirstRec = TestRec
Else
LastRec = TestRec
End If
Loop

' If here then value not found
BSearchNumber = CVErr(xlErrValue)

End Function
===============================================================


--
Kind regards,

Niek Otten
Microsoft MVP - Excel

| Dear Niek, dear Bob,
| ... which begs the next question: how do I (efficently) search in my newly
| created arrays, preferably using an application.match or vba find type of
| function (the examples have all search through ranges). I guess I can pull
| out my best guess candidate and then sequentially reduce or increase) the
| element position position unit I get to the right one but is there a built in
| function for thsi?
| Thanks again to both of you for your very helpful comments.
| Best wishes, Boris.
|
| "Niek Otten" wrote:
|
| > < Are you aware of a good way of reading in even such large data sets?>
| >
| > The way Bob suggested; doing that 100 times took 3 seconds
| >
| > --
| > Kind regards,
| >
| > Niek Otten
| > Microsoft MVP - Excel
| >
| > | Hi Niek,
| > | Thanks for your very fast reply.
| > | My data set is typically 35000 rows and at least three columns. I am rather
| > | wary of reading that as an array in to VBA though I am well aware that VBA is
| > | much faster at accessing internally stored data. Are you aware of a good way
| > | of reading in even such large data sets? If not, I guess could perhaps do it
| > | in junks...
| > | I will try Application.Match (though I had read that Find is up to 100 times
| > | faster) but am surpirsed that there is no vba built in method for approximate
| > | finds (I have been trying to achieve the desired result by playing around
| > | with the date formatting and converting the result to a string for the
| > | comparison but...).
| > | Thanks gain for your response - and if you can let me know how to read in
| > | large data sets efficiently, that would be really handy.
| > | Best wishes, Boris.
| > |
| > | "Niek Otten" wrote:
| > |
| > | > Application.Match on a sorted table, approximate search, is very fast. A self developed binary search can be faster, it
seems
| > to
| > | > depend on several parameters and I've seen cases where it was slower. Anyway, read the whole table into a VBA array and
search
| > in
| > | > that array; much faster than looking in the worksheet.
| > | >
| > | > --
| > | > Kind regards,
| > | >
| > | > Niek Otten
| > | > Microsoft MVP - Excel
| > | >
| > | >
| > | > | Hi,
| > | > | I am trying to mimic the approximate find methods used in Excel MATCH,
| > | > | VLOOKUP, etc. application functions in VBA. I have a large(!) data set and,
| > | > | in this particular instance, want to perform operations (average, max, min)
| > | > | on subranges containing data for different periods. The data is arranged in
| > | > | ascending order and I am using the date and time column of the data to select
| > | > | my subranges. However, although the data is usually equally spaced, there is
| > | > | a small tolerances (no doubt due to rounding operations) and some date and
| > | > | time values will be missing. As I am performing this action thousands of
| > | > | times in the code, I do not want to resort to Application.VLOOKUP or
| > | > | Application.Match. I cannot, however, find any adjustments to the VBA FIND
| > | > | method that will allow me to check for a match within a certain tolerances
| > | > | (say within five minutes of my target date and time). What I would really
| > | > | like is to find the cell in the column with and exact or largest value that
| > | > | is less than my target (and I can then check the tolerances separately).
| > | > | Although I am sure that this is a common requirement, I have been unable to
| > | > | find anything to allow me to do this in an efficient manner.
| > | > | Any help, clues or suggestions would be most gratefully appreciated.
| > | >
| > | >
| > | >
| >
| >
| >
 
G

Guest

Dear Bob,
Of course! Sorry for asking such a trivial question... by the way, did you
see the binary search function that Niek posted in one of his replies to my
query? That will be extremely useful too.
Many thanks and best wishes, Boris.
 
G

Guest

Dear Niek,
Great function! Theee I was using long and complicated if... then...elseif
routines for binary searching but this will be fantastic.
Many thanks to you and Bob for your extremely helpful replies.
Best wishes, Boris.
 
G

Guest

Dear Niek
One last thing (honest)... Bsearch... as this is not a VBA buit in method,
do you happen to have the code for that too? I assume it is to perform a
binary search amongst string variables (from the very short hint about it in
bsearchnumber) or perhaps it has even greater flexibility. If you happen to
have the code and it is acceptable to post it, I would be very intrested in
that function too.
Many thanks again for all your help.
Best wishes, Boris.
 
B

Bob Phillips

Boris,

This is a very common computer algorithm, I first used it back in the 70s.
The technique essentially takes the middle item and looks if you value is
greater or less than. If greater, it takes the middle of that half, else it
takes the middle of the second half, and so on.

I googled this and found this. It is longer than the number search, more
work to do

Function BSearch(vIn As Variant, vitem As Variant) As Long
' --------------------------------------------------
' DESCRIPTION: Perform a binary search on an (assumed)
' sorted list
'
' RETURNS: The record/row number in which
' a match was found or #VALUE if no
' match found
' --------------------------------------------------
Dim iFirstRec As Long, iLastRec As Long
Dim iTestRec As Long, iLastTest As Long
Dim vrng As Variant


vrng = vIn ' This pointless assignment appears necessary!


If Not IsArray(vrng) Then
BSearch = CVErr(xlErrValue)
Exit Function
End If


If UBound(vrng, 2) <> LBound(vrng, 2) Then
' Vectors only please!
BSearch = CVErr(xlErrValue)
Exit Function
End If


' First deal with the simple cases


If vrng(LBound(vrng), 1) > vitem _
Or vrng(UBound(vrng), 1) < vitem Then
BSearch = CVErr(xlErrValue)
Exit Function
End If


If vrng(LBound(vrng), 1) = vitem Then
BSearch = LBound(vrng, 1)
Exit Function
End If


If vrng(UBound(vrng), 1) = vitem Then
BSearch = UBound(vrng, 1)
Exit Function
End If


iLastTest = -1
iFirstRec = LBound(vrng, 1)
iLastRec = UBound(vrng, 1)


Do
iTestRec = Int((iFirstRec + iLastRec) / 2)


If iTestRec = iLastTest Then Exit Do


iLastTest = iTestRec


If vrng(iTestRec, 1) = vitem Then
BSearch = iTestRec
Exit Function
End If


If vrng(iTestRec, 1) < vitem Then
iFirstRec = iTestRec
Else
iLastRec = iTestRec
End If
Loop


' If here then value not found
BSearch = CVErr(xlErrValue)


End Function


--

HTH


Bob Phillips

(replace xxxx in the email address with gmail if mailing direct)
 
N

Niek Otten

Here's the code. Note that is meant to be called from a worksheet. It's not too difficult to make it handle VBA arrays; compare
with Bsearchnumber.
==========================================
Option Compare Text
Option Explicit

Function BSearch(vRng As Range, vItem As Range) As Long
' --------------------------------------------------
' DESCRIPTION: Perform a binary search on an (assumed)
' sorted list
'
' RETURNS: The record/row number in which
' a match was found or #VALUE if no
' match found
' --------------------------------------------------
Dim FirstRec As Long, LastRec As Long
Dim TestRec As Long, LastTest As Long
Dim VarFound

LastTest = -1
FirstRec = 1
LastRec = vRng.Count

Do
TestRec = Int((FirstRec + LastRec + 1) / 2)

If TestRec = LastTest Then Exit Do

LastTest = TestRec

VarFound = vRng(TestRec, 1)

If VarFound = vItem Then
BSearch = TestRec
Exit Function
End If

If VarFound < vItem Then
FirstRec = TestRec
Else
LastRec = TestRec
End If
Loop

' If here then value not found
BSearch = CVErr(xlErrValue)

End Function
==========================================


--
Kind regards,

Niek Otten
Microsoft MVP - Excel


| Dear Niek
| One last thing (honest)... Bsearch... as this is not a VBA buit in method,
| do you happen to have the code for that too? I assume it is to perform a
| binary search amongst string variables (from the very short hint about it in
| bsearchnumber) or perhaps it has even greater flexibility. If you happen to
| have the code and it is acceptable to post it, I would be very intrested in
| that function too.
| Many thanks again for all your help.
| Best wishes, Boris.
|
| "Niek Otten" wrote:
|
| > <can't remember who wrote it>
| >
| > It was Peter Beach
| >
| > --
| > Kind regards,
| >
| > Niek Otten
| > Microsoft MVP - Excel
| >
| >
| >
 
G

Guest

Dear Bob,
Many thanks. I know the theory of binary searches and even the slightly more
complex fibannacci approach but, to date, was manually coding the sequence
with If... then...elseif statements which can become a bit of a headache...
I will certainly try this code in future.
Best wishes, and many many thanks, Boris.
 
G

Guest

Dear iek,
Brilliant. That will come in very handy! No more writing hundreds of
if...then...else sequences to sort through my binary searches (although I
understand that, for truely random distributions, a fibanacci sequence is
likely to be even more efficient...)
Many many thanks for all your help.
Best wishes, andhave a nice weekend, Boris.
PS Still struggling with the Range.Find method but it seems to be quite a
pain when using dates - just doing it 'cause I do not like to be beaten by
the software...
 
G

Guest

Dear Niek, dear Bob
After all your help, I am still struggling!
I have now tried setting up 2 large arrays, one for the date/time and one
for my data. My problem is this: if I set my array up to a specific size and
type (Dim TimeSArray(34944) as double), I get the error that I cannot set the
array to a range (a6:a34950) (the error is a compile error "can't assign to
array"). If I set the array up as variant, and do not specify the size, I can
set it equal to the range I want but then the line that calls BSearchNumber
gives me a compile error ("type mismatch: array or user defined type
expected"). I am calling BSearchNumber with a line like:
SomEInteger=BSearchNumber(TimeSArray, TimeSArray(j-1)+SomeConst) and was
hoping to loop through this for my 34000 searches (j is the loop counter).
I am sure I am making a very rudimentary error but so far have been unable
to spot it... Any suggestions? This bit of code is beginning to drive me up
the wall....
Many hanks for all your help.
With very best wishes, Boris.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top