Help with Relationships

  • Thread starter Thread starter ryguy7272
  • Start date Start date
R

ryguy7272

Hello, I am a little new to creating relationships between tables. I have
two data tables with similar data; one has sales data for 2007 and one has
sales data for 2008. The table with 2007 sales data has a field named 2007,
and the data type is Text. Similarly, the table with 2008 sales data has a
field named 2008, and the data type is Text. I created another table, which
has just three fields, ID, 2007, and 2008. ID is of DataType AutoNumber, and
2007 is Text, and 2008 is also Text. I thought I could create a relationship
between my 2007 sales data table and my 2008 sales data table, via this third
table, but it doesn’t seem to be working at all. When I drag/drop the 2007
field for the Sales data table to the Years data table, I keep getting
messages saying that ‘no unique index is found for the referential field of
the primary table’. I guess Referential Integrity is violated, but I can’t
figure out how to overcome this. Can someone point out my mistake?

Regards,
Ryan---
 
Your inquiry raises a lot of questions, but here are a few thoughts.

A few of the omissions from your description may also be omissions from your
overall process on this.

You need to start with a precised definition of what data you want to
record, and what do you want your database application to do with that data.
Then pick a table / table relationship structure that is a good foundation
for that. And then implement that stucture in Access.

But a quick solution to your issue of the moment (without knowing the
specifics) would be to try making one or both of the fields in the
relationship into a primary key.
 
Your problem comes from a mis-understanding of data structure, and how to
normalise it. Instead of trying to set up a relationship between your
existing tables - which will probably not work; that's not how relationships
work in a relational database - you would do better to normalise your data.

A real give-away to non-normalised data is when you have similar data
fields, but separate tables for the same data for different years. Another
(probably not the case here, but I can't tell because you don't give any
fieldnames) is when you have fields with names such as "child1", "child2",
.... , "childn". In both cases, you are storing data in table or field
names.

A quick-and-dirty normalisation method is to combine the data from both
tables via a Union query, adding a new field to provide the Year data for
each record.

A better way is to set up a new table, including the Year field, and append
the data from each of your existing tables, with a constant entry for the
Year field from each of the existing tables. There are some potential
problems with this, though: eg. if the existing tables have other tables
related to them via a link from a primary key field in the original tables
(perhaps, and most commonly, an autonumber) you will need to store the old
PK field, and then use that as a link in an update query to subsequently fix
the related tables' data.

I suggest that you read up on data normalisation as a first step; you'll
find some useful references here:
http://www.accessmvp.com/JConrad/accessjunkie/resources.html#DatabaseDesign101

HTH,

Rob
 
Ryan:

As Rob has explained your problem stems from a design flaw in that you are
encoding data, in this case the year of sale by the looks of it, as table
names. Whether you are also encoding data a column names is hard to say. In
any event you should get the data into from both columns into one table in
which the year of sale is indicated by a value in a column. If each sale
record has a SaleDate or similar column of date/time data type already then
this does this and you can always get the year with Year(SaleDate) in a query
for instance.

What kind of values do the 2007 and 2008 columns in the two tables hold?
Presumably they represent the same kind attribute type, e.g. a customer or
product, but whatever the case the column should be titled so that it
describes the attribute type, e.g. CustomerID, ProductID etc.

You can create a single table by first creating an empty table and then
appending all rows from each of the current two tables into it with an append
query. If the current tables don't have anything to indicate the year, then
add a SaleYear column of integer number data type to the new table and append
a constant 2007 or 2008 value in each of the two append queries, e.g. taking
a very simple example of tables Sales2007 and Sales 2008 with columns
ProductID, CustomerID, Quantity, UnitPrice, then the append queries to insert
the reows into a single table Sales would be:

INSERT INTO Sales(ProductID, CustomerID, Quantity, UnitPrice, SaleYear)
SELECT ProductID, CustomerID, Quantity, UnitPrice, 2007
FROM Sales2007;

and:

INSERT INTO Sales(ProductID, CustomerID, Quantity, UnitPrice, SaleYear)
SELECT ProductID, CustomerID, Quantity, UnitPrice, 2008
FROM Sales2008;

Remember that if object names include spaces or other special characters you
have to wrap them in brackets, e.g. [Sales 2007]. If in doubt use the
brackets.

Once you have a single table like this you can use queries to extract data
to suit your requirements, including aggregating data, comparing data between
one year and another etc. You might need to do a bit more work to normalize
the table by decomposing it into several related tables. If for instance you
are repeating customer names and addresses in more than one row of the sales
table these should be split off into a Customers table with one row per
customer, using a foreign key CustomerID in the Sales table to reference the
primary key of the Customers table. the same would go for products, but one
thing which would be retained in Sales as well as being included in Products
is the unit price per product as this will change over time, but you'd want
the original prices at the time of each sale to remain static. This is
what's known as a 'functional dependency' in the jargon; the current price is
functionally dependant on the key of Products, but the sale price is
functionally dependant on the key of Sales. This is a very important concept
in relational database design and underlies the whole process of
'normalization'.

The above is not intended to reflect your actual business model or tables of
course, but only to indicate the general nature of the task to achieve a set
of correctly normalized tables. Rob has given you a link on normalization,
but in addition to that here's my own potted summary which I use to outline
the subject:


"Normalization is the process of eliminating redundancy from a database, and
involves decomposing a table into several related tables. In a relational
database each table represents an entity type, e.g. Contacts, Companies,
Cities, States etc. and each column in a table represents an attribute type
of the entity type, e.g. ContactID, FirstName and LastName might be attribute
types of Contacts and hence columns of a Contacts table. Its important that
each attribute type must be specific to the entity type, so that each 'fact'
is stored once only. In the jargon its said that the attribute type is
'functionally dependent' solely on the whole of the primary key of a table.

To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.

Similarly the Companies table might have a CityID column and the Cities
table a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in) a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.

To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of 'update anomalies', e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA."

Ken Sheridan
Stafford, England
 
I just bought the MS Access 2007 Bible yesterday; this seems like a very
worthwhile investment. I'll read that, and I'll study all of these responses
more in the coming days. Thanks guys for the help and advice!!

Regards,
Ryan--

--
RyGuy


Ken Sheridan said:
Ryan:

As Rob has explained your problem stems from a design flaw in that you are
encoding data, in this case the year of sale by the looks of it, as table
names. Whether you are also encoding data a column names is hard to say. In
any event you should get the data into from both columns into one table in
which the year of sale is indicated by a value in a column. If each sale
record has a SaleDate or similar column of date/time data type already then
this does this and you can always get the year with Year(SaleDate) in a query
for instance.

What kind of values do the 2007 and 2008 columns in the two tables hold?
Presumably they represent the same kind attribute type, e.g. a customer or
product, but whatever the case the column should be titled so that it
describes the attribute type, e.g. CustomerID, ProductID etc.

You can create a single table by first creating an empty table and then
appending all rows from each of the current two tables into it with an append
query. If the current tables don't have anything to indicate the year, then
add a SaleYear column of integer number data type to the new table and append
a constant 2007 or 2008 value in each of the two append queries, e.g. taking
a very simple example of tables Sales2007 and Sales 2008 with columns
ProductID, CustomerID, Quantity, UnitPrice, then the append queries to insert
the reows into a single table Sales would be:

INSERT INTO Sales(ProductID, CustomerID, Quantity, UnitPrice, SaleYear)
SELECT ProductID, CustomerID, Quantity, UnitPrice, 2007
FROM Sales2007;

and:

INSERT INTO Sales(ProductID, CustomerID, Quantity, UnitPrice, SaleYear)
SELECT ProductID, CustomerID, Quantity, UnitPrice, 2008
FROM Sales2008;

Remember that if object names include spaces or other special characters you
have to wrap them in brackets, e.g. [Sales 2007]. If in doubt use the
brackets.

Once you have a single table like this you can use queries to extract data
to suit your requirements, including aggregating data, comparing data between
one year and another etc. You might need to do a bit more work to normalize
the table by decomposing it into several related tables. If for instance you
are repeating customer names and addresses in more than one row of the sales
table these should be split off into a Customers table with one row per
customer, using a foreign key CustomerID in the Sales table to reference the
primary key of the Customers table. the same would go for products, but one
thing which would be retained in Sales as well as being included in Products
is the unit price per product as this will change over time, but you'd want
the original prices at the time of each sale to remain static. This is
what's known as a 'functional dependency' in the jargon; the current price is
functionally dependant on the key of Products, but the sale price is
functionally dependant on the key of Sales. This is a very important concept
in relational database design and underlies the whole process of
'normalization'.

The above is not intended to reflect your actual business model or tables of
course, but only to indicate the general nature of the task to achieve a set
of correctly normalized tables. Rob has given you a link on normalization,
but in addition to that here's my own potted summary which I use to outline
the subject:


"Normalization is the process of eliminating redundancy from a database, and
involves decomposing a table into several related tables. In a relational
database each table represents an entity type, e.g. Contacts, Companies,
Cities, States etc. and each column in a table represents an attribute type
of the entity type, e.g. ContactID, FirstName and LastName might be attribute
types of Contacts and hence columns of a Contacts table. Its important that
each attribute type must be specific to the entity type, so that each 'fact'
is stored once only. In the jargon its said that the attribute type is
'functionally dependent' solely on the whole of the primary key of a table.

To relate tables a 'referencing' table will have a foreign key column which
makes the link to the 'referenced' table, e.g. a Contacts table might have a
CompanyID column as a foreign key, while a Companies table has a CompanyID
column as its primary key. Consequently no data other than the CompanyID
needs to be stored in a row in the Contacts table for all the company
information for that contact to be known; its available via the relationship
and can be returned in a query joining the two tables on the CompanyID
columns.

Similarly the Companies table might have a CityID column and the Cities
table a StateID column. If its an international database the States (or more
generically Regions) table would have a CountryID referencing the primary key
of a Countries table. So via the relationships, simply by entering (in
reality this would be selected from a list of Companies in a combo box, not
typed in) a CompanyID in the Contacts table the location of the contact's
company is also known. Redundancy, and therefore repetitive data entry is
avoided.

To see how a database is made up of related tables take a look at the sample
Northwind database. Its not completely normalized in fact (deliberately so
for the sake of simplicity) but illustrates the main principles of how tables
representing entity types relate to each other. An example of its lack of
proper normalization can be found in its Customers table. You'll see that
this has City, Region and Country columns so we are told numerous times that
São Paulo is in SP region (as is Resende) and that SP region is in Brazil.
Not only does this require repetitive data entry, but more importantly it
opens up the risk of 'update anomalies', e.g. it would be perfectly possible
to put São Paulo in California in one row and California in Ireland! Proper
normalization as I described above would prevent this as the fact that São
Paulo is in SP region would be stored only once in the database as would the
fact that SP region is in Brazil and California is in the USA."

Ken Sheridan
Stafford, England

ryguy7272 said:
Hello, I am a little new to creating relationships between tables. I have
two data tables with similar data; one has sales data for 2007 and one has
sales data for 2008. The table with 2007 sales data has a field named 2007,
and the data type is Text. Similarly, the table with 2008 sales data has a
field named 2008, and the data type is Text. I created another table, which
has just three fields, ID, 2007, and 2008. ID is of DataType AutoNumber, and
2007 is Text, and 2008 is also Text. I thought I could create a relationship
between my 2007 sales data table and my 2008 sales data table, via this third
table, but it doesn’t seem to be working at all. When I drag/drop the 2007
field for the Sales data table to the Years data table, I keep getting
messages saying that ‘no unique index is found for the referential field of
the primary table’. I guess Referential Integrity is violated, but I can’t
figure out how to overcome this. Can someone point out my mistake?

Regards,
Ryan---
 
Back
Top