How to create a table of unique localities from a mass of duplicates?

T

Tina

I have to warn you that I'm an Access semi-novice - I can make search
queries and update and append readily, but I've never used SQL before.
If what I am trying to do is too advanced for simple query building,
then I'll just hobble on as I am doing.

I have a table with three fields: Locality, Latitude, Longitude. Each
field has multiple duplicates, but they are not necessarily connected.

For example, we have many Localities that are exactly the same, with
different latitudes and longitudes. We have some records with the same
Locality and Latitude, but differing Longitudes, and others with the
same Locality and Longitude, but differing Latitudes.


I'd like to make a query that pulls out all unique Localities that have
differing latitudes and longitudes. A simple "Find Duplicates" query
based on Latitude misses a few records whose latitudes are the same but
the longitudes are different. What I have thought about doing is making
one table with the unique latidudes, another with unique longitudes,
appending them together, and then deleting all duplicates based on the
Locality name. But I feel like 1. I am missing something and 2. There
must be an easier way to do this.
 
J

John Vinson

I'd like to make a query that pulls out all unique Localities that have
differing latitudes and longitudes.

A "Self Join" query will work neatly for this.

Create a new query by adding your table to the query grid twice.
Access will alias the second instance by appending _1 to its name.

Join the two instances on Locality. Select both pairs of latitudes and
longitude.

As a criterion on Table_1.Latitude put

<>
.[Latitude]

and likewise for longitude. Or, you might want to allow for "almost"
matches - put in a calculated field

DistError: Abs(
.[latitude] - [table_1].[latitude]) +
Abs(
.[longitude] - [table_1].[longitude])

and use some criterion such as < 0.01 to find cases where there are
two Localities with the same name within about six miles of one
another (or adjust the criterion as needed).

I would suggest NOT storing these results in a table - just use the
Query as the recordsource for a form or report.


John W. Vinson[MVP]
 
T

Tina

Hi John, thanks for the reply! I must be doing something wrong -

I made a subset of the data into a new table with five records, which
includes the same record twice, to test this on - the new table is
named "test." Made a new Select Query, and added the table twice - one
is named "test" and the alias is "test_1."

Joined "test" and "test_1" on the Locality field. Double-clicked on
the "Latitude" and "Longitude" field from both "test" and "test_1" to
add them to the query.

test_1.Latitude's criteria is

<> [test].[Latitude]

On the same line for test_1.Longitude's criteria is

<> [test].[Longitude]

I run the query, and I get no results.

I must be misreading your directions...

John said:
I'd like to make a query that pulls out all unique Localities that have
differing latitudes and longitudes.

A "Self Join" query will work neatly for this.

Create a new query by adding your table to the query grid twice.
Access will alias the second instance by appending _1 to its name.

Join the two instances on Locality. Select both pairs of latitudes and
longitude.

As a criterion on Table_1.Latitude put

<>
.[Latitude]

and likewise for longitude. Or, you might want to allow for "almost"
matches - put in a calculated field

DistError: Abs(
.[latitude] - [table_1].[latitude]) +
Abs(
.[longitude] - [table_1].[longitude])

and use some criterion such as < 0.01 to find cases where there are
two Localities with the same name within about six miles of one
another (or adjust the criterion as needed).

I would suggest NOT storing these results in a table - just use the
Query as the recordsource for a form or report.


John W. Vinson[MVP]
 
J

John Vinson

I made a subset of the data into a new table with five records, which
includes the same record twice, to test this on - the new table is
named "test." Made a new Select Query, and added the table twice - one
is named "test" and the alias is "test_1."

Your description sounds correct. Please post the SQL view of the
query, and the five records from the table... something's not right
here.

Just to verify: my intent was to find instances where the same value
of Location exists in two (or more) records, with different values of
Latitude and Longitude.

John W. Vinson[MVP]
 
R

Ron2006

I think you ALMOST have it.

You are saying in your test that BOTH the latitude AND the longitude
must be different.
If one of them is different but not the other it will NOT be selected.

If you want either of them to be different then I think it should be:

test_1.Latitude's criteria is
<> [test].[Latitude]


On the NEXT line for test_1.Longitude's criteria is
<> [test].[Longitude]

Ron
 
T

Tina

Ah! Got it! Thanks so much to both of you!

So, just to clarify it in my head, if I had a record set that looked
like the following:

NamedPlace DecLat DecLong
Charlie Taylors Mountain 37.7262 -79.2761
Lowesville 37.7348 -79.0683
Lowesville 37.7348 -79.0683
Accomack County 37.74045 -75.64303
Accomack County 37.74045 -78.1776

and I wanted the final results set to look like:

NamedPlace DecLat DecLong
Charlie Taylors Mountain 37.7262 -79.2761
Lowesville 37.7348 -79.0683
Accomack County 37.74045 -75.64303
Accomack County 37.74045 -78.1776

I could run the query that John set up for me to see that there are two
Accomack Counties with differing lat longs. I could then run a find
duplicates query to see the duplicate Lowesvilles, and then I could run
a third query that contained the "leftovers" and the three combined
would give me the above table. Does that sound feasible?
 
J

John Vinson

So, just to clarify it in my head, if I had a record set that looked
like the following:

NamedPlace DecLat DecLong
Charlie Taylors Mountain 37.7262 -79.2761
Lowesville 37.7348 -79.0683
Lowesville 37.7348 -79.0683
Accomack County 37.74045 -75.64303
Accomack County 37.74045 -78.1776

and I wanted the final results set to look like:

NamedPlace DecLat DecLong
Charlie Taylors Mountain 37.7262 -79.2761
Lowesville 37.7348 -79.0683
Accomack County 37.74045 -75.64303
Accomack County 37.74045 -78.1776

Umm...?!

That's not what you asked for. You asked to be able to find duplicate
names at different locations.

This is much simpler:

SELECT DISTINCT NamedPlace, DecLat, DecLong
FROM tablename;

The DISTINCT keyword (view the query's Properties and set the Unique
Values property to True if you're using the grid) will weed out the
dups.

John W. Vinson[MVP]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top