Deleting Duplicates

G

Guest

Ok, I have made a "Find Duplicates" query for my program. How can I
automatically delete the data with the lower idsArchiveID number?

Some background. Users are able to archive their data daily but sometimes
they archive twice, or more, in one day. When they go back to retrieve their
data they get every record that was archived and they cannot tell which data
is the most recent. How can I modify my "Find Duplicates" query to
automatically delete those records with the smaller ID number so that each
Bureau Number has one entry per date? Is there someway to use the Dmax
function? Any suggestions would be great.

Data:

Field: intBureauNumber
Table: tblAircraftDataArchive
Sort: Ascending
Criteria: In (SELECT [intBureauNumber] FROM [tblAircraftDataArchive] As Tmp
GROUP BY [intBureauNumber],[datStorageDate] HAVING Count(*)>1 And
[datStorageDate] = [tblAircraftDataArchive].[datStorageDate])

Field: datStorageDate
Table: tblAircraftDataArchive
Sort: Ascending

Field: idsArchiveID
Table: tblArchiveDataArchive
Sort: Ascending

intBureauNumber datStorageDate idsArchiveID Need to delete
157329 5/2/2005 3060 <-----
157329 5/2/2005 3087
157329 5/4/2005 3166 <-----
157329 5/4/2005 3173 <-----
157329 5/4/2005 3230 <-----
157329 5/4/2005 3237

Thanks,

Jeff
 
A

Allen Browne

The simplest way to dedupe is to turn your query into a Make Table query
(Make Table on Query menu). This gives you a new, deduplicated table, which
replaces the original.

The grouping query should be able to select Min of idsArchiveID for you.
 
G

Guest

Morning Allen,

Thanks for the feedback.

The problem would be that using Min would only show those entries with the
lowest ID number. There are multiple entries for one date so I need to
delete ALL lower ID numbers for that date.

I had some other advise and changed the original query to a Total query and
grouped the Bureau and Date fields and changed the ArchiveID to Max and that
brought up the maximum ID number for those Bureau numbers that had multiple
entries.

I then tried to make another query using the ArchiveID field with the
criteria as 'Not In (Select idsArchiveID FROM
FindduplicatesfortblAircraftDataArchive)' but the dynaset is blank. The
formula returns nothing.

Any ideas on how I can call up the data that is not in the other query?

Jeff

Allen Browne said:
The simplest way to dedupe is to turn your query into a Make Table query
(Make Table on Query menu). This gives you a new, deduplicated table, which
replaces the original.

The grouping query should be able to select Min of idsArchiveID for you.
--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Jeff said:
Ok, I have made a "Find Duplicates" query for my program. How can I
automatically delete the data with the lower idsArchiveID number?

Some background. Users are able to archive their data daily but sometimes
they archive twice, or more, in one day. When they go back to retrieve
their
data they get every record that was archived and they cannot tell which
data
is the most recent. How can I modify my "Find Duplicates" query to
automatically delete those records with the smaller ID number so that each
Bureau Number has one entry per date? Is there someway to use the Dmax
function? Any suggestions would be great.

Data:

Field: intBureauNumber
Table: tblAircraftDataArchive
Sort: Ascending
Criteria: In (SELECT [intBureauNumber] FROM [tblAircraftDataArchive] As
Tmp
GROUP BY [intBureauNumber],[datStorageDate] HAVING Count(*)>1 And
[datStorageDate] = [tblAircraftDataArchive].[datStorageDate])

Field: datStorageDate
Table: tblAircraftDataArchive
Sort: Ascending

Field: idsArchiveID
Table: tblArchiveDataArchive
Sort: Ascending

intBureauNumber datStorageDate idsArchiveID Need to delete
157329 5/2/2005 3060 <-----
157329 5/2/2005 3087
157329 5/4/2005 3166 <-----
157329 5/4/2005 3173 <-----
157329 5/4/2005 3230 <-----
157329 5/4/2005 3237

Thanks,

Jeff
 
A

Allen Browne

Think I missed what the problem is here.

If you GROUP BY intBureauNumber and datStorageDate, and ask for
min(idsArchiveID), you get the lowest Archive number for the date (which is
what you want?) Then if you write that to a table, the new table lacks all
the other higher values of the archive number. Is that not the result you
want?

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Jeff said:
Morning Allen,

Thanks for the feedback.

The problem would be that using Min would only show those entries with the
lowest ID number. There are multiple entries for one date so I need to
delete ALL lower ID numbers for that date.

I had some other advise and changed the original query to a Total query
and
grouped the Bureau and Date fields and changed the ArchiveID to Max and
that
brought up the maximum ID number for those Bureau numbers that had
multiple
entries.

I then tried to make another query using the ArchiveID field with the
criteria as 'Not In (Select idsArchiveID FROM
FindduplicatesfortblAircraftDataArchive)' but the dynaset is blank. The
formula returns nothing.

Any ideas on how I can call up the data that is not in the other query?

Jeff

Allen Browne said:
The simplest way to dedupe is to turn your query into a Make Table query
(Make Table on Query menu). This gives you a new, deduplicated table,
which
replaces the original.

The grouping query should be able to select Min of idsArchiveID for you.

Jeff said:
Ok, I have made a "Find Duplicates" query for my program. How can I
automatically delete the data with the lower idsArchiveID number?

Some background. Users are able to archive their data daily but
sometimes
they archive twice, or more, in one day. When they go back to retrieve
their
data they get every record that was archived and they cannot tell which
data
is the most recent. How can I modify my "Find Duplicates" query to
automatically delete those records with the smaller ID number so that
each
Bureau Number has one entry per date? Is there someway to use the Dmax
function? Any suggestions would be great.

Data:

Field: intBureauNumber
Table: tblAircraftDataArchive
Sort: Ascending
Criteria: In (SELECT [intBureauNumber] FROM [tblAircraftDataArchive] As
Tmp
GROUP BY [intBureauNumber],[datStorageDate] HAVING Count(*)>1 And
[datStorageDate] = [tblAircraftDataArchive].[datStorageDate])

Field: datStorageDate
Table: tblAircraftDataArchive
Sort: Ascending

Field: idsArchiveID
Table: tblArchiveDataArchive
Sort: Ascending

intBureauNumber datStorageDate idsArchiveID Need to delete
157329 5/2/2005 3060 <-----
157329 5/2/2005 3087
157329 5/4/2005 3166 <-----
157329 5/4/2005 3173 <-----
157329 5/4/2005 3230 <-----
157329 5/4/2005 3237
 
G

Guest

No, I need to keep the highest number since that is the most up to date
information. I found a website that discusses the SQL for this type of
problem.

http://www.15seconds.com/issue/011009.htm

I read through the SQL code and figured out how to put one together for my
data. It seems to be working well now. I've been able to go through each
user's data and delete all the duplicated data they have saved and I've put a
command button on their forms in order to clean up the archive data if they
need to.

Thanks for your help.

Jeff

Allen Browne said:
Think I missed what the problem is here.

If you GROUP BY intBureauNumber and datStorageDate, and ask for
min(idsArchiveID), you get the lowest Archive number for the date (which is
what you want?) Then if you write that to a table, the new table lacks all
the other higher values of the archive number. Is that not the result you
want?

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Jeff said:
Morning Allen,

Thanks for the feedback.

The problem would be that using Min would only show those entries with the
lowest ID number. There are multiple entries for one date so I need to
delete ALL lower ID numbers for that date.

I had some other advise and changed the original query to a Total query
and
grouped the Bureau and Date fields and changed the ArchiveID to Max and
that
brought up the maximum ID number for those Bureau numbers that had
multiple
entries.

I then tried to make another query using the ArchiveID field with the
criteria as 'Not In (Select idsArchiveID FROM
FindduplicatesfortblAircraftDataArchive)' but the dynaset is blank. The
formula returns nothing.

Any ideas on how I can call up the data that is not in the other query?

Jeff

Allen Browne said:
The simplest way to dedupe is to turn your query into a Make Table query
(Make Table on Query menu). This gives you a new, deduplicated table,
which
replaces the original.

The grouping query should be able to select Min of idsArchiveID for you.

Ok, I have made a "Find Duplicates" query for my program. How can I
automatically delete the data with the lower idsArchiveID number?

Some background. Users are able to archive their data daily but
sometimes
they archive twice, or more, in one day. When they go back to retrieve
their
data they get every record that was archived and they cannot tell which
data
is the most recent. How can I modify my "Find Duplicates" query to
automatically delete those records with the smaller ID number so that
each
Bureau Number has one entry per date? Is there someway to use the Dmax
function? Any suggestions would be great.

Data:

Field: intBureauNumber
Table: tblAircraftDataArchive
Sort: Ascending
Criteria: In (SELECT [intBureauNumber] FROM [tblAircraftDataArchive] As
Tmp
GROUP BY [intBureauNumber],[datStorageDate] HAVING Count(*)>1 And
[datStorageDate] = [tblAircraftDataArchive].[datStorageDate])

Field: datStorageDate
Table: tblAircraftDataArchive
Sort: Ascending

Field: idsArchiveID
Table: tblArchiveDataArchive
Sort: Ascending

intBureauNumber datStorageDate idsArchiveID Need to delete
157329 5/2/2005 3060 <-----
157329 5/2/2005 3087
157329 5/4/2005 3166 <-----
157329 5/4/2005 3173 <-----
157329 5/4/2005 3230 <-----
157329 5/4/2005 3237
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top