Need Help with Delete SQL for Access

A

abaker.junk

Hi,

I have an MS Access Table that I use to store log information and it
looks something like this:

tblLog

ID Job SN Message Date
1 script1 abc123 Failure 1/1/2007
2 script2 abc123 Failure 1/1/2007
3 script2 abc123 Success 1/2/2007
4 script2 def456 Success 1/2/2007
5 script1 def456 Success 1/9/2007
6 script1 def456 Failure 1/10/2007
7 script1 abc123 Success 1/10/2007

I would like to create a query that would delete all of the old
information. In other words, I only want to keep the data from the
last time the script was run.

If I were going to select the data, I would use something like this:

SELECT tblLog.job, tblLog.SN, Max(tblLog.Date) AS MaxOfDate
FROM tblLog
GROUP BY tblLog.job, tblLog.SN

How would I create a delete statement that would delete everything but
what appears in this select statement?
 
G

Guest

Assuming that the ID field data only increments upwards:

DELETE *
FROM tblLog
WHERE tblLog.ID NOT IN (SELECT Max(tblLog.ID) AS MaxOfID
FROM tblLog
GROUP BY tblLog.Job, tblLog.SN);
 
G

Guest

I answered this yesterday! Did you cross-post it or did the forum software
hickup? I noticed that there are no new posts which is strange. Fortunately I
have my notes still.

Anyway, this works assuming that the ID field always increments to a larger
number.

DELETE *
FROM tblLog
WHERE tblLog.ID Not In (SELECT Max(tblLog.ID) AS MaxOfID
FROM tblLog
GROUP BY tblLog.Job, tblLog.SN);
 
G

Guest

You could create a delete query with a where statement where you supply the
date between certain dates.

docmd.runsql "DELETE * FROM tblLog WHERE date = " your criteria here...

hth
 
A

abaker.junk

I answered this yesterday! Did you cross-post it or did the forum software
hickup? I noticed that there are no new posts which is strange. Fortunately I
have my notes still.

Anyway, this works assuming that the ID field always increments to a larger
number.

DELETE *
FROM tblLog
WHERE tblLog.ID Not In (SELECT Max(tblLog.ID) AS MaxOfID
FROM tblLog
GROUP BY tblLog.Job, tblLog.SN);
--
Jerry Whittle, Microsoft Access MVP
Light. Strong. Cheap. Pick two. Keith Bontrager - Bicycle Builder.










- Show quoted text -

I posted this with google groups only to
microsoft.public.access.queries but I do see that you answered it
twice.

Your delete query works well in my sample table, but can we always
rely on ID incrementing to a larger #? ID is an autonumber field, if
the DB gets compacted, won't that allow newer records to have lower
IDs? Is there a way to do the query with the Date field? Also, I
hate to ask it, but what if you wanted to leave the latest 3 records
of a particular script. Is that possible with SQL? I usually resort
to outside code to do these functions but I would prefer to do it in
SQL.

Thanks for your responses!!
 
A

abaker.junk

The problem I have is that I don't know the date ahead of time. I
just want to keep a record of the last time a particular script was
run. So if the last record of a script running was 6 months ago, I
want to make sure I keep that record.
 
G

Guest

Use a subquery to identify the latest row per job/sn and delete all rows
where the date is less than the date returned by the subquery, e.g.

DELETE *
FROM tblLog AS T1
WHERE [Date] <
(SELECT Max([Date])
FROM tblLog AS T2
WHERE T2.job = T1.job
AND T2.SN= T1.SN);

Be sure to back up the table beforehand!

Ken Sheridan
Stafford, England
 
A

abaker.junk

This query worked very well on my simple test table with 7 entries,
but when I tried it on a copy of the real Access DB, it ran for many
hours before I cancelled it. I think there about about 400k entries
in the DB right now but I know there would be more like 30k if I was
able to delete the old records. Is there a way I can speed the query
up? I already indexed the SN and the Date fields. Thanks!

Use a subquery to identify the latest row per job/sn anddeleteall rows
where the date is less than the date returned by the subquery, e.g.

DELETE*
FROM tblLog AS T1
WHERE [Date] <
(SELECT Max([Date])
FROM tblLog AS T2
WHERE T2.job = T1.job
AND T2.SN= T1.SN);

Be sure to back up the table beforehand!

Ken Sheridan
Stafford, England



I have an MSAccessTable that I use to store log information and it
looks something like this:

ID Job SN Message Date
1 script1 abc123 Failure 1/1/2007
2 script2 abc123 Failure 1/1/2007
3 script2 abc123 Success 1/2/2007
4 script2 def456 Success 1/2/2007
5 script1 def456 Success 1/9/2007
6 script1 def456 Failure 1/10/2007
7 script1 abc123 Success 1/10/2007
I would like to create a query that woulddeleteall of the old
information. In other words, I only want to keep the data from the
last time the script was run.
If I were going to select the data, I would use something like this:
SELECT tblLog.job, tblLog.SN, Max(tblLog.Date) AS MaxOfDate
FROM tblLog
GROUP BY tblLog.job, tblLog.SN
How would I create adeletestatement that woulddeleteeverything but
what appears in this select statement?- Hide quoted text -

- Show quoted text -
 
J

John W. Vinson

This query worked very well on my simple test table with 7 entries,
but when I tried it on a copy of the real Access DB, it ran for many
hours before I cancelled it. I think there about about 400k entries
in the DB right now but I know there would be more like 30k if I was
able to delete the old records. Is there a way I can speed the query
up? I already indexed the SN and the Date fields. Thanks!

Use a subquery to identify the latest row per job/sn anddeleteall rows
where the date is less than the date returned by the subquery, e.g.

DELETE*
FROM tblLog AS T1
WHERE [Date] <
(SELECT Max([Date])
FROM tblLog AS T2
WHERE T2.job = T1.job
AND T2.SN= T1.SN);

Is there an index on Job as well?

You may have better luck with a JOIN rather than a subquery:

DELETE T1.*
FROM tblLog AS T1 INNER JOIN tblLog AS T2
ON t2.job = t1.job
AND t2.SN = t1.SN
AND t2.[Date] < t1.[Date];


John W. Vinson [MVP]
 
G

Guest

Another possibility might be:

DELETE *
FROM tblLog AS T1
WHERE EXISTS
(SELECT *
FROM tblLog AS T2
WHERE T2.job = T1.job
AND T2.SN = T1.SN
AND T2.[Date] > T1.[Date]);

This should delete any row where one or more rows exist in the table where
the job and SN columns match those in the current row and the date is greater
than that in the current row. Consequently the rows with the latest date per
job/SN would not be deleted. Again indexing is important.

Ken Sheridan
Stafford, England

This query worked very well on my simple test table with 7 entries,
but when I tried it on a copy of the real Access DB, it ran for many
hours before I cancelled it. I think there about about 400k entries
in the DB right now but I know there would be more like 30k if I was
able to delete the old records. Is there a way I can speed the query
up? I already indexed the SN and the Date fields. Thanks!

Use a subquery to identify the latest row per job/sn anddeleteall rows
where the date is less than the date returned by the subquery, e.g.

DELETE*
FROM tblLog AS T1
WHERE [Date] <
(SELECT Max([Date])
FROM tblLog AS T2
WHERE T2.job = T1.job
AND T2.SN= T1.SN);

Be sure to back up the table beforehand!

Ken Sheridan
Stafford, England



I have an MSAccessTable that I use to store log information and it
looks something like this:

ID Job SN Message Date
1 script1 abc123 Failure 1/1/2007
2 script2 abc123 Failure 1/1/2007
3 script2 abc123 Success 1/2/2007
4 script2 def456 Success 1/2/2007
5 script1 def456 Success 1/9/2007
6 script1 def456 Failure 1/10/2007
7 script1 abc123 Success 1/10/2007
I would like to create a query that woulddeleteall of the old
information. In other words, I only want to keep the data from the
last time the script was run.
If I were going to select the data, I would use something like this:
SELECT tblLog.job, tblLog.SN, Max(tblLog.Date) AS MaxOfDate
FROM tblLog
GROUP BY tblLog.job, tblLog.SN
How would I create adeletestatement that woulddeleteeverything but
what appears in this select statement?- Hide quoted text -

- Show quoted text -
 
A

abaker.junk

All of the suggestions in this thread seem to work great, but take way
too long on my database. I'm not sure why this would be as all of the
fields mentioned are indexed.

However, I did use what I learned here to create a new set of queries
that seem to work well:

The following creates a table called tblBAD that contains the primary
keys of the records that should be deleted.

Query #1
SELECT o.ID, o.Log_Date INTO tblBAD FROM tblLog AS o LEFT JOIN [SELECT
Last(tblLog.ID) AS LastOfID, tblLog.SN, Max(tblLog.Log_Date) AS
MaxOfLog_Date, tblLog.Job FROM tblLog GROUP BY tblLog.SN,
tblLog.Job ]. AS i ON o.ID = i.LastOfID WHERE (((i.LastOfID) Is
Null))

Then I run the following query that deletes all the records from
tblLog that exist in tblBAD that are older than 7 days.

Query #2
Delete tblLog.ID, DateDiff('d',tblLog.Log_Date,Now()) AS Expr1,
DateDiff('d',[tblLog].[Log_Date],Now()) FROM tblLog WHERE
(((tblLog.ID) In (SELECT tblBAD.ID FROM tblBAD)) AND
((DateDiff('d',[tblLog].[Log_Date],Now()))>7))

Even with 400k records, these 2 queries take less than a minute to
run. Can I trust these queries to give me constant results? I am a
little suspect of my method of selecting the primary key for the
record of the newest log. Can I trust Last(tblLog.ID) to always be
the record with the latest date?
 
J

John W. Vinson

Can I trust Last(tblLog.ID) to always be
the record with the latest date?

Emphatically NO, you CANNOT!

Last() is pretty near useless. It returns the last record *in disk storage
order*. This will pretty often be the same as the order in which records were
entered; but a) you can't be sure that records were entered newest last, and
b) you can't assume that the records WILL be stored on disk in entry order.
Access will put the records wherever in the table it finds convenient.

To get the record with the newest date you must use a Subquery or a DMax()
function to identify the newest record, or use a TOP VALUES query to return
the record with the largest date.

John W. Vinson [MVP]
 
A

abaker.junk

Ok, so I know the following query is bad:

SELECT Last(tblLog.ID) AS LastOfID, tblLog.Job, Max([tblLog.Date]) AS
MaxOfLog_Date
FROM tblLog
GROUP BY tblLog.Job
HAVING (((tblLog.Job)='script1'))

Can you show me an example of how to do the same this with subqueries,
DMax() or TOP VALUE? The reason why I settled on Last() was that I
could not get any other method to return a single primary key. I also
considered Max(ID) but this database is going to get a lot of turn
over in records and I figured Access is going to start recycling
primary key auto numbers at some point.

I apologize in advance for my ignorance.
 
N

neurodetonal

SQL example: The inner select will return the latest run date for each
corresponding combination of Job and SN. The outer select reassociates
the ID even if the max date for a particular combination is stored in
an earlier ID.

SELECT a.LogID, a.Job, a.SN, a.Date
FROM tblLog a INNER JOIN (
SELECT b.Job, SN, MAX(b.Date) MaxDate
FROM tblLog b
GROUP BY b.Job, b.SN
) c ON a.Job = c.Job AND a.SN = c.SN AND a.Date = c.MaxDate


Using the select within a delete statement to remove all other
records:
DELETE FROM tblLog
WHERE LogID NOT IN (
SELECT a.LogID
FROM tblLog a INNER JOIN (
SELECT b.Job, SN, MAX(b.Date) MaxDate
FROM tblLog b
GROUP BY b.Job, b.SN
) c ON a.Job = c.Job AND a.SN = c.SN AND a.Date = c.MaxDate
)


An example in Access:
www.vistitude.com/temp/ScriptLog.zip


If dupes of Job and SN on the same day are an issue, modify as
follows. Essentially, you pick an arbitrary ID to go with, in this
case max ID.

SELECT MAX(a.LogID), a.Job, a.SN, a.Date
FROM tblLog a INNER JOIN (
SELECT b.Job, SN, MAX(b.Date) MaxDate
FROM tblLog b
GROUP BY b.Job, b.SN
) c ON a.Job = c.Job AND a.SN = c.SN AND a.Date = c.MaxDate
GROUP BY a.Job, a.SN, a.Date


If processing speed is still an issue, you want to use my select with
your two query method for deleting.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top