Identical records

S

Srowe

I wondering if there is a way to combine multiples of the same record into
one. Specifically I have a autonumber field for a subject. At present my
database is setup fairly poorly in that when you add a subject it creates a
ID for that subject. Problem is the subject could be in the database several
times therefore having several Id's. This makes it hard to retrieve all files
which may be associated to the individual.

I realize that at present my database tables are set up poorly and I am in
the process of correcting that. The database has been operational for 4 years
now and has thousands of entries. Once I get the tables normalized I want to
combine all of the identical records into one.

Thanks in advance.

Scott
 
T

Tom Wickerath

Hi Scott,

Does your table that includes the duplicate subject entries serve as a
parent table with related child tables? In other words, would you have
SubjectID foreign key values that need to be changed as well?


Tom Wickerath
Microsoft Access MVP
http://www.accessmvp.com/TWickerath/
__________________________________________
 
S

Srowe

Thanks Tom,

The SubjID field is in three other tables besides my subject table. I am
going to be adding another table to eliminate the duplication of a person's
records. Such as a SubjectInvolvement table or something like that. It will
contain will have its own autonum which will have very limited involvement.
The SubjId will then become a foreign key in this table. I will then base a
form off of the SubjectInvolvement table for data entry. I believe that this
will eliminate the assigning of a SubjId number evrytime you enter a subject.
If I'm not on the right track let me know.

Thanks in Advance.

Scott

Right now the database is setup such that it works as a 0ne-to-many
relationship instead of the many-to-many relationship it needs to be. The
subject table will essentially just be for the storage of the data. It will
have a majority of the search forms based off it.
 
N

nono

Srowe said:
I wondering if there is a way to combine multiples of the same record into
one. Specifically I have a autonumber field for a subject. At present my
database is setup fairly poorly in that when you add a subject it creates
a
ID for that subject. Problem is the subject could be in the database
several
times therefore having several Id's. This makes it hard to retrieve all
files
which may be associated to the individual.

I realize that at present my database tables are set up poorly and I am in
the process of correcting that. The database has been operational for 4
years
now and has thousands of entries. Once I get the tables normalized I want
to
combine all of the identical records into one.

Thanks in advance.

Scott
 
J

John W. Vinson

I wondering if there is a way to combine multiples of the same record into
one. Specifically I have a autonumber field for a subject. At present my
database is setup fairly poorly in that when you add a subject it creates a
ID for that subject. Problem is the subject could be in the database several
times therefore having several Id's. This makes it hard to retrieve all files
which may be associated to the individual.

I realize that at present my database tables are set up poorly and I am in
the process of correcting that. The database has been operational for 4 years
now and has thousands of entries. Once I get the tables normalized I want to
combine all of the identical records into one.

Thanks in advance.

Scott

If it's a simple case, say a table with SubjectID (autonumber primary key) and
Subject (text), with duplicates - you can get rid of all the *EXACT* dups by
creating a new table with the same fields; put a unique Index on the Subject;
and run an Append query to populate the new table from the old one. You'll get
an error message like "318 records were not added due to key violations" -
that's the duplicates going into the trash bin.

Delete the old table and rename the new one when you're CERTAIN that
everything's in.

This will not of course handle "almost duplicates" such as Subjects "Math",
"Maths", "Mathematics" and "Arithmatic". Getting rid of them will require a
USB interface (Using Someone's Brain).
 
T

Tom Wickerath

Hi Scott,
The SubjID field is in three other tables besides my subject table.

I kind of suspected that this might be the case. This is certainly going to
complicate matters, since you will need some way of updating the foreign key
values, after eliminating duplicated records in the parent table.
I am going to be adding another table to eliminate the duplication of a person's
records. Such as a SubjectInvolvement table or something like that.

I would hold off on any major re-designs of this nature, until you have the
immediate problem resolved, ie. all duplicate subject records eliminated, and
all related child record foreign key values updated to match the primary key
value of the record that you will be saving for each subject. Only then would
I consider adding any tables.

I think I would attack this issue in the following manner:

1.) First, make a back-up copy of your database, so that if something goes
wrong, you can easily recover.

2.) Open the Subject table in design view. Change the autonumber primary key
to a Number / Long Integer.

Note: You may have to temporarily delete relationships between this table
and any related child tables, in order to change the data type.

3.) Create (or re-create) relationships with enforced referential integrity
(RI) between this table and the three related tables you mentioned.
Hopefully, this will not be a big problem, and shouldn't be if you previously
had relationships with enforced RI. If you do run into problems at this
stage, then you'll need to run the unmatched query wizard to find any child
records in each table that do not have a matching parent record in the
Subject table. Do *NOT* check the option enable Cascade Delete.

4.) How many records are in your Subject table? If the number is not too
many (like say a couple hundred maximum), you might simply try opening this
table in normal preview mode, and sorting on the Subject text. Then try
deleting some of the obvious duplicate subject records, one at a time. If you
are able to delete, then no problem. However, if there are matching child
records in any of the three related child tables, the JET database engine
will prevent you from deleting the record, as long as you did not enable
Cascade Delete.

You might also start with a grouped query, just to get an idea of the scope
of your task at hand. To do this, create a new query in design view. Select
the Subject table. Add the Subject field two times to the QBE (Query by
Example) grid. Do not add the autonumber field or any other fields at this
time. Click on the Sigma button. You should see a new row in the QBE grid
that reads "Total" and defaults to "Group By" for the two fields. Change the
second field from Group By to Count. Add an ascending sort to the first
field. Run the query. This should tell you how many duplicate occurances of
each subject are present.

Note: you can add a criteria to the field that includes the Count, with >1,
to filter out any records that have no duplicates. Also, this will only count
*exact* duplicate entries in the Subject field.

5.) For duplicate records that you cannot delete, you will need to update
the foreign key values in the related tables, such that they match the
primary key value of the record that you will retain for each unique subject.
If this looks like it is going to be a very difficult task, due to the number
of duplicate records with related child records, then post back with this
input. There may be a way that we can figure out how to design an appropriate
update query--I simply don't know yet, as I haven't tried setting this up.
Note: Such a query would likely involve a subquery, to locate the Top 1
record for each unique subject entry.

6.) After you have all duplicate records successfully removed from your
Subject table, copy the structure only of this table to a new empty table.
Open the empty table in design view, and add an autonumber primary key with
the same name as the original field. Delete the Long Integer key field in the
new empty table. Then run an append query to append records from your old
Subject table to the new empty subject table. You should be able to append
the numeric long integer values from the old table to the autonumber field in
the new table.


Tom Wickerath
Microsoft Access MVP
http://www.accessmvp.com/TWickerath/
__________________________________________
 
S

Srowe

Thanks Tom,

I have done what you have said so far. Unfortuneatly it has revealed further
problems with my data entry. It appears that because we have several users
and that I have just recently updated the database to autofill information
about a subject without the user having to input every value, the subject not
only has several different id numbers but it has several different ways they
have been inputed. Whether it is from lazy users or for whatever reason.

I have also had to add the subjects firstname to the query in order to
distinguish between possible same last names ie.(smith, john; smith, steve).

After that is all said and done I am left with 750 duplicate entries of the
same last name and first name, again all having different id numbers. Most
subjects have 2 or three different entries with the most being 12 duplicates
of the same subject.

Can you give me further instruction as to how to esh the duplicates together?

P.S. I have managed to limit all subjects to the subject table. the SubjID
field now only appears in this table.

Thanks,

Scott
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top