Duplicated records

Pietro · Jul 6, 2008

Hi,

I've a problem in my database,sometimes,due to the several equationas and
codes i'm using,records are duplicated,how can i run a delete query to delete
records that have the same value of the fields "SR" and "Received",I need to
keep one reocrd only and delete all the other repeated records.

Guest · Jul 6, 2008

Make a back up copy of the database first.

The table must have a primary key.

The table must have a numeric index.

If the primary key is not a numeric index, add a numeric
index to the table.

Make a select query that joins the table to another
copy of the same table. Join all the duplicate records.

Then delete all the records where the index on the joined
table is greater than the index on the first copy of the
table

(david)

Ken Sheridan · Jul 6, 2008

Firstly the table must have a primary key column or an otherwise uniquely
indexed column. If not add an autonumber column, MyID say. If the primary
key is a multi-column one then it can still be done, but its simpler with a
single column, so in this case add the autonumber column temporarily; you can
delete it again later. Then execute the following Delete query:

DELETE *
FROM YourTable As T1
WHERE EXISTS
(SELECT *
FROM YourTable AS T2
WHERE T2.SR = T1.SR
AND T2.Received = T1.Received
AND T2.MyID < T1.MyID);

Be sure to back-up the table first.

Ken Sheridan
Stafford, England

Pietro · Jul 6, 2008

Thank you Ken for your reply...
The below query does not work,

Ken Sheridan · Jul 6, 2008

That doesn't give me much of a clue. In what way is it not working? Is it
reporting an error, and if so what, or just not doing what you expect?

You have, I assume, changed 'YourTable' in the query to the actual name of
your table? It should then work if:

1. The table has columns named SR and Received.

2. The table has a primary key or otherwise uniquely indexed column named
MyID. If its named something else then you'll also need to change 'MyID' in
the query to the actual name.

3. The table contains subsets of rows where the values in the SR and
Received columns are the same in all rows in the subset, and you want to
delete all but one row from each subset.

Ken Sheridan
Stafford, England

Pietro · Jul 6, 2008

After changing the names into my database's ones,here's the code:

Expr1: Exists (SELECT FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

Note that the exact name of the field SR is "Service Request"

I get an error message "Syntax error in expression.

Ken Sheridan · Jul 6, 2008

You are missing the asterisk after SELECT in the subquery. It should read:

(SELECT * FROM capâ€¦â€¦..

The EXISTS predicate simple tests for at least one row being returned by the
subquery for the current row in the outer query. The asterisk is shorthand
for 'all columns'. Using it in this context allows the optimal use of any
indexes and speeds performance.

Also, if you have opened it in query design view rather than SQL view make
sure the 'show' check-box for the column with the subquery is unchecked.

If you still have problems switch to SQL view and copy the entire SQL and
post it back here.

Ken Sheridan
Stafford, England

Pietro said:
After changing the names into my database's ones,here's the code:

Expr1: Exists (SELECT FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

Note that the exact name of the field SR is "Service Request"

I get an error message "Syntax error in expression.

Ken Sheridan said:

That doesn't give me much of a clue. In what way is it not working? Is it
reporting an error, and if so what, or just not doing what you expect?

You have, I assume, changed 'YourTable' in the query to the actual name of
your table? It should then work if:

1. The table has columns named SR and Received.

2. The table has a primary key or otherwise uniquely indexed column named
MyID. If its named something else then you'll also need to change 'MyID' in
the query to the actual name.

3. The table contains subsets of rows where the values in the SR and
Received columns are the same in all rows in the subset, and you want to
delete all but one row from each subset.

Ken Sheridan
Stafford, England

Click to expand...

Pietro · Jul 6, 2008

Hi,

Here's nthe exact code I've used (In SQL)
DELETE * FROM cap As T1 WHERE EXISTS
(SELECT* FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

The query kept loading for long time and did not give me any result..
Thank you anyway for your patience
Waiting for your reply ASAP

Ken Sheridan said:
You are missing the asterisk after SELECT in the subquery. It should read:

(SELECT * FROM capâ€¦â€¦..

The EXISTS predicate simple tests for at least one row being returned by the
subquery for the current row in the outer query. The asterisk is shorthand
for 'all columns'. Using it in this context allows the optimal use of any
indexes and speeds performance.

Also, if you have opened it in query design view rather than SQL view make
sure the 'show' check-box for the column with the subquery is unchecked.

If you still have problems switch to SQL view and copy the entire SQL and
post it back here.

Ken Sheridan
Stafford, England

Pietro said:

After changing the names into my database's ones,here's the code:

Expr1: Exists (SELECT FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

Note that the exact name of the field SR is "Service Request"

I get an error message "Syntax error in expression.

Ken Sheridan said:

That doesn't give me much of a clue. In what way is it not working? Is it
reporting an error, and if so what, or just not doing what you expect?

You have, I assume, changed 'YourTable' in the query to the actual name of
your table? It should then work if:

1. The table has columns named SR and Received.

2. The table has a primary key or otherwise uniquely indexed column named
MyID. If its named something else then you'll also need to change 'MyID' in
the query to the actual name.

3. The table contains subsets of rows where the values in the SR and
Received columns are the same in all rows in the subset, and you want to
delete all but one row from each subset.

Ken Sheridan
Stafford, England

:

Thank you Ken for your reply...
The below query does not work,

:

Firstly the table must have a primary key column or an otherwise uniquely
indexed column. If not add an autonumber column, MyID say. If the primary
key is a multi-column one then it can still be done, but its simpler with a
single column, so in this case add the autonumber column temporarily; you can
delete it again later. Then execute the following Delete query:

DELETE *
FROM YourTable As T1
WHERE EXISTS
(SELECT *
FROM YourTable AS T2
WHERE T2.SR = T1.SR
AND T2.Received = T1.Received
AND T2.MyID < T1.MyID);

Be sure to back-up the table first.

Ken Sheridan
Stafford, England

:

Hi,

I've a problem in my database,sometimes,due to the several equationas and
codes i'm using,records are duplicated,how can i run a delete query to delete
records that have the same value of the fields "SR" and "Received",I need to
keep one reocrd only and delete all the other repeated records.

Click to expand...

Click to expand...

Ken Sheridan · Jul 6, 2008

The SQL looks fine; there is no space between SELECT and the asterisk in the
subquery, but that is in fact permissible. The speed issue may be helped by
indexing the service request and received columns, both non-uniquely. The ID
column should already be indexed uniquely of course.

You can see how it works by creating a similar query in the sample Northwind
database Use a normal SELECT query so as to avoid the risk of accidental
deletions, but the principle is exactly the same:

SELECT * FROM [Order Details] As T1
WHERE EXISTS
(SELECT * FROM[Order Details] AS T2
WHERE T2.ProductID=T1.ProductID
AND T2.Quantity = T1.Quantity
AND T2.OrderID < T1.OrderID)
ORDER BY ProductID, Quantity;

You'll see that this returns all rows per product with the same productID
and quantity values, apart from those with the lowest orderID per product.
So if this were changed to a DELETE query it would delete these rows, leaving
only one per productID/quantity. That's not something anyone would wish to
do of course but it illustrates how a query of this nature works.

Another way you can ditch the redundant rows is to firstly copy the table to
the clipboard and then paste its 'structure only' as a new table under a new
name. Delete the ID column and make the service request and received columns
the primary key of the new table. To do this in table design view highlight
both fields by dragging over the field selectors (the small grey square to
the far left) of both. Right click and select primary key from the shortcut
menu. You should seethe key symbol against both fields. Save the amended
design of the empty new table.

Then create an append query to append all columns from your original table,
apart from ID, to the new empty table. When you run this append query only
one row per service request/received values will be accepted due to the key
violations, so you'll end up with a single row for each. Once you are happy
that you have the correct data you can then delete the original table and
rename the new one back to the original's name. By having the composite
primary key the insertion of any more invalid duplicate rows will be
prevented of course. If you want to reference this table with a foreign key
in another table, the foreign key would have to be a composite one of the
service request and received columns of course. What you'd be doing is using
'natural' keys rather than 'surrogate' keys. However, you should not need to
do this as the DELETE query should work.

I'm now signing off for today.

Ken Sheridan
Stafford, England

Pietro said:
Hi,

Here's nthe exact code I've used (In SQL)
DELETE * FROM cap As T1 WHERE EXISTS
(SELECT* FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

The query kept loading for long time and did not give me any result..
Thank you anyway for your patience
Waiting for your reply ASAP

Ken Sheridan said:

You are missing the asterisk after SELECT in the subquery. It should read:

(SELECT * FROM capâ€¦â€¦..

The EXISTS predicate simple tests for at least one row being returned by the
subquery for the current row in the outer query. The asterisk is shorthand
for 'all columns'. Using it in this context allows the optimal use of any
indexes and speeds performance.

Also, if you have opened it in query design view rather than SQL view make
sure the 'show' check-box for the column with the subquery is unchecked.

If you still have problems switch to SQL view and copy the entire SQL and
post it back here.

Ken Sheridan
Stafford, England

Pietro said:

After changing the names into my database's ones,here's the code:

Expr1: Exists (SELECT FROM cap AS T2 WHERE [T2].[service request]
=[T1].[service request] AND T2.Received = T1.Received AND T2.ID < T1.ID)

Note that the exact name of the field SR is "Service Request"

I get an error message "Syntax error in expression.

:

That doesn't give me much of a clue. In what way is it not working? Is it
reporting an error, and if so what, or just not doing what you expect?

You have, I assume, changed 'YourTable' in the query to the actual name of
your table? It should then work if:

1. The table has columns named SR and Received.

2. The table has a primary key or otherwise uniquely indexed column named
MyID. If its named something else then you'll also need to change 'MyID' in
the query to the actual name.

3. The table contains subsets of rows where the values in the SR and
Received columns are the same in all rows in the subset, and you want to
delete all but one row from each subset.

Ken Sheridan
Stafford, England

:

Thank you Ken for your reply...
The below query does not work,

:

Firstly the table must have a primary key column or an otherwise uniquely
indexed column. If not add an autonumber column, MyID say. If the primary
key is a multi-column one then it can still be done, but its simpler with a
single column, so in this case add the autonumber column temporarily; you can
delete it again later. Then execute the following Delete query:

DELETE *
FROM YourTable As T1
WHERE EXISTS
(SELECT *
FROM YourTable AS T2
WHERE T2.SR = T1.SR
AND T2.Received = T1.Received
AND T2.MyID < T1.MyID);

Be sure to back-up the table first.

Ken Sheridan
Stafford, England

:

Hi,

I've a problem in my database,sometimes,due to the several equationas and
codes i'm using,records are duplicated,how can i run a delete query to delete
records that have the same value of the fields "SR" and "Received",I need to
keep one reocrd only and delete all the other repeated records.

Click to expand...

Click to expand...

Duplicated records

Pietro

Guest

Ken Sheridan

Pietro

Ken Sheridan

Pietro

Ken Sheridan

Pietro

Ken Sheridan