Finding the overall nearest match to a group of parameters (number

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

The real-world problem is this: I have a large number of facilities around
the world that I want compare based on their operating
parameters/configuration.

The parameters are expressed numerically: E.g., Facility A has 15 machines,
1200sqm, 20 employees, 5 shipping doors, Facility B has 12 machines, 1000sqm,
12 employees, 6 shipping doors, ...

As each facility is somewhat unique, I am trying to find a way in MS Access
that allows me to identify those facilities that are the nearest/closest
overall match to a selected facility. All parameters are of equal weight.

Is there a simple way?

Thanks!!
 
Frank

How would you instruct a person to conduct this "match-up"?

Access is less smart than any person you would have help you -- you'll have
to be very explicit.

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
Hello Jeff:

I worded the problem broadly because I was hoping that there is a function
that I just have simply overlooked. What I am looking for is a function that
filters records based on their closest match to a given number. (In a single
iteration, it would be analog to a VLookup-Function in Excel with the
range-lookup=TRUE, for the overall result one would have to run multiple
iterations analog to the solver function in Excel).

In terms of the actual process, the user goes manually through a series of
filters starting with geography (region), facility production volume, the
relevant units of equipment, etc. until the user has identified a group of
facilities that are most similar (most "like") to the facility in question.

To give a specific example, one location has 1199sqm and another has
1201sqm, one location produces 999 units per year, another 1000.2.
Essentially, they have the same size and volume and the user would see it
that way. However, MS Access will not because the values are different. I
started out using the BETWEEN criteria in the query but it is very awkward to
use as it needs to be adjusted for every iteration. I have started to write
code to make the BETWEEN range definition dynamic, but it is becoming very
complex (Standard Evolutionary) and almost not worth the effort.

Any advice on what other paths I could take?

Frank
 
Calculate the absolute differences would be something like the following,
but on a volume of 1000 units a difference of 100 might be significant
while on a volume of 100,000 that difference is probably not significant.

SELECT A.Id, B.ID,
Abs(A.Volume -B.Volume) as VolumeDiff,
Abs(A.Employees-B.Employees) as EmployeeDiff.
Abs(A.Size - B.Size) = SizeDiff
FROM Facilities as A, Facilities as B
WHERE A.ID <> B.ID

So I think you still need to define what constitutes a significant
difference. Is it a ratio of a to b, the absolute difference, etc. And
with multiple factors do you add up the differences, average them, or use
some other algorithm to decide.

--
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
..
 
Frank

I suppose one way of comparing two sets of field values would be to use
something akin to the Chi Squared test (see Stats 101).

But you could also do a first approximation by calculating the percentage in
each field between pairs and adding the percentages (you did say all fields
were equally significant) before dividing by the number of fields. The
"average" percentage closest to 100% would be one interpretation of a
"closest match."

Good luck

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
Thank you, John, great idea!
By calculating the absolute difference and expressing the difference in
percent, MIN of summed percentage per record should be the best match.
This is certainly better than the road I was about to take... Thank you!
Frank
 
Great!! Thank you, Jeff


Jeff Boyce said:
Frank

I suppose one way of comparing two sets of field values would be to use
something akin to the Chi Squared test (see Stats 101).

But you could also do a first approximation by calculating the percentage in
each field between pairs and adding the percentages (you did say all fields
were equally significant) before dividing by the number of fields. The
"average" percentage closest to 100% would be one interpretation of a
"closest match."

Good luck

Regards

Jeff Boyce
Microsoft Office/Access MVP
 
Still pretty vague.

Using the standard metric for distance in an un-weighted
single dimension geometry ;-)

SELECT TOP 1
A.Facility, B.Facility,
Abs(A.Machines - B.Machines) +
Abs(A.sqm- B.sqm) +
Abs(A.employees- B.employees) +
. . .
As TotalMeasure
FROM table As A, table As B
WHERE A.Facility = [Enter Facility]
ORDER BY 3
 
Thank you, Marshall!
I normalized the absolute difference by parameter by expressing it as a
percent and summed up the total across a record, picking the lowest sum.
Frank
Marshall Barton said:
Still pretty vague.

Using the standard metric for distance in an un-weighted
single dimension geometry ;-)

SELECT TOP 1
A.Facility, B.Facility,
Abs(A.Machines - B.Machines) +
Abs(A.sqm- B.sqm) +
Abs(A.employees- B.employees) +
. . .
As TotalMeasure
FROM table As A, table As B
WHERE A.Facility = [Enter Facility]
ORDER BY 3
--
Marsh
MVP [MS Access]

I worded the problem broadly because I was hoping that there is a function
that I just have simply overlooked. What I am looking for is a function that
filters records based on their closest match to a given number. (In a single
iteration, it would be analog to a VLookup-Function in Excel with the
range-lookup=TRUE, for the overall result one would have to run multiple
iterations analog to the solver function in Excel).

In terms of the actual process, the user goes manually through a series of
filters starting with geography (region), facility production volume, the
relevant units of equipment, etc. until the user has identified a group of
facilities that are most similar (most "like") to the facility in question.

To give a specific example, one location has 1199sqm and another has
1201sqm, one location produces 999 units per year, another 1000.2.
Essentially, they have the same size and volume and the user would see it
that way. However, MS Access will not because the values are different. I
started out using the BETWEEN criteria in the query but it is very awkward to
use as it needs to be adjusted for every iteration. I have started to write
code to make the BETWEEN range definition dynamic, but it is becoming very
complex (Standard Evolutionary) and almost not worth the effort.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top