clean up city, state and zip

S

Song Su

I have downloaded a zipcode database and want to use it to find mistakes in
my table with 20,000 records. The zipcode database sample records as
follows: (ZipID is auto number primary)

City State Zip DefaultCity (Yes/No type:
Yes means default city; No means acceptable city)
Monterey Park CA 91754 Yes
Hacienda Heights CA 91745 Yes
Industry CA 91745 No
La Puente CA 91745 No

My table has student ID as primary key. Sample records as follows:


Monterey Park CA 91745 (wrong zip code, should be selected)
Monterey Pk CA 91754 (wrong city spelling, should be
selected)
Monterey Park LA 91754 (wrong state, should be selected)

however,

Industry CA 91745 (should NOT be selected as it's
acceptable city for that zipcode)
La Puente CA 91745 (same as above)

How to set up query (or queries) to select wrong zipcode, misspelled city or
state but not to select acceptable cities?

Thanks
 
J

John Spencer

For wrong City or State you could use
SELECT *
FROM StudentTable as S INNER JOIN ZipTable as Z
ON S.ZIP = Z.Zip
WHERE S.State <> Z.State OR S.City NOT IN
(SELECT City FROM ZipTable as Z2 WHERE Z2.Zip = Z.Zip)

The correlated subquery above may be a bit slow. IF you wish you limit
matches to only the default city you could use
SELECT *
FROM StudentTable as S INNER JOIN ZipTable as Z
ON S.ZIP = Z.Zip
WHERE S.State <> Z.State OR
(S.City <> Z.City and Z.DefaultCity = True)



Wrong ZIP
SELECT *
FROM StudentTable as S INNER JOIN ZipTable as Z
ON S.City = Z.City AND S.State=Z.State
WHERE S.ZIP <> Z.ZIP

No match at all
SELECT *
FROM StudentTable as S LEFT JOIN ZipTable as Z
ON S.City = Z.City AND S.State=Z.State AND S.Zip = Z.ZIP
WHERE Z.Zip Is Null



'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================
 
S

Song Su

Thanks, John, for your quick reply. I don't know what 'correlated subquery'
means. I want to create query and then a report based on this query. How to
put these queries together so my report can be based on?

The only thing I could not figure out is 'acceptable city' which should not
be result of this query.

-- Song
 
J

John Spencer

A correlated subquery is a query that gets data from the main query to
decide what it should be returning. So the first query I posted needs to
know the zip from the main query to decide which cities to return for
the criteria to validate the city name.


To put the queries together, I would use a UNION Query.

SELECT *
FROM StudentTable as S INNER JOIN ZipTable as Z
ON S.ZIP = Z.Zip
WHERE S.State <> Z.State OR S.City NOT IN
(SELECT City FROM ZipTable as Z2 WHERE Z2.Zip = Z.Zip)

UNION ALL


SELECT *
FROM StudentTable as S INNER JOIN ZipTable as Z
ON S.City = Z.City AND S.State=Z.State
WHERE S.ZIP <> Z.ZIP

UNION ALL

SELECT *
FROM StudentTable as S LEFT JOIN ZipTable as Z
ON S.City = Z.City AND S.State=Z.State AND S.Zip = Z.ZIP
WHERE Z.Zip Is Null

UNION Queries cannot be built in the query grid view. You can build the
three separate queries and save them. Then you can use the three
separate queries in a union query.

SELECT * FROM QueryONE
UNION ALL
SELECT * FROM QueryTWO
UNION ALL
SELECT * FROM QueryTHREE

It is important that all three queries have the same number of fields
and the fields must be in the same order. Otherwise, the UNION query
will fail,


'====================================================
John Spencer
Access MVP 2002-2005, 2007
Center for Health Program Development and Management
University of Maryland Baltimore County
'====================================================
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top