Design help

M

MikeB

I'm pottering about creating a chess database for a small club at my
daughter's school.

What I have is working reasonably well, but there are a few problems
and issues with it that only the creator of such a database would put
up with.

So I'm thinking of a 2.0 version with some of the design issues
removed.

One of the most vexing problems I have is how to record games.

I have a table named "Games", and it has a field for correlating the
game to a particular match, a field for the white player an the black
player as well as their scores, the number of moves, the time and a
field to store the moves in (I'm not really using the last three
moves, I just have it there in case my players get more
sophisticated).

I then run a union query to extract the players by getting first all
the white players and then all the black players..

What I"m having difficulties with is how to treat a bye (when a player
doesn't have an opponent and advances to the next round with a point
as if they had won). In my current design, I have a Player named "Bye"
and then I simply enter this player into the Games record as the
opponent. This has the problem that the Bye player is onthe list of
players (doing rather poorly, having never won a game), but also it
forces me to record a color for the player playing the Bye. This is a
problem, because player should play an even number of black and white
games, and if I have him recorded as having played white against Bye,
he hasn't really played white and I have to keep track of that
throughout a tournament (many games in many matches).

I'm thinking of adding a field into the Games table "IsBye", that will
indicate if that particular Game is in essence a Bye. That creates
difficulties in running the Union Query to create the playerGames
table, I'm thinking of creating a separate Byes table, but that
creates difficulties as I then cannot add Byes using the same form as
I use to add Games.

I know this is rambling and I apologize. If anyone has some advice,
I'd really appreciate it. Thanks.
 
K

Ken Sheridan

It seems to me that your Games table suffers from a fundamental design flaw
in that you are 'encoding data as columns headings'. By this I mean that you
have separate columns for white and black players and their scores. It’s a
fundamental principle of the databases relational model that data should be
stored only as values at column positions in rows in tables; its known as the
'information principle'.

Your Games table should have columns representing the attributes of the
games entity type per se. One would be a primary key, e.g. GameID, and
another a foreign key referencing the key of a Matches table of course, but
there may well be others depending on what attributes of each match you wish
to record.

The players' participation in each game can be recorded in another table
GamePlayers say, with a foreign key GameID column and columns for the
player's ID (referencing a Players table), colour played, and score. I'd
suggest that legitimate values for colour played be White, Black and N/A
(selected from a combo box), the last for use in byes. This overcomes the
need to give the player a dummy colour played in the case of a bye, but also
identifies which games are byes of course. A bye can also be identified by
the fact that the count of rows in the GamePlayers table with a particular
GameID is 1 rather than 2. The PlayerID, MatchID and ColourPlayed columns
should be the subject of a unique index as these values in combination must
be unique.

For recording moves, if you do implement that part of the database, you
could have a further table Moves with a composite foreign key of PlayerID and
GameID referencing the similarly composite primary key of GamePlayers. You
could record a move as a single text value using the conventional notation,
or with multiple columns. I don't see a lot of advantage in the latter,
however. A column to record the sequence of moves per player in the game
would be needed of course. This could either be a simple sequential
numbering per player/game, which can easily be computed and inserted
automatically, or a date/time value, which would enable time analysis of
moves.

I should emphasize that the above merely represents my initial thoughts on
an appropriate model, and would need to be rigorously examined to be sure it
can return the information you require out of the database. I think it
provides the basis of a model for storing all the data as values in correctly
normalized tables, though, and as well as enabling you to store (and
retrieve) the true facts with regard to games, both those played and byes,
also avoids the need for a Union operation to return data for both players in
a game.

Ken Sheridan
Stafford, England
 
M

MikeB

Thanks for responding. I think I see the point you are trying to make.
What I had difficulty with was to devise a form that would add two
rows to the PlayerGames table, hence I devised the table Games, that
has both a white and a black player column. I can easily design a form
that allows for data entry into this table. I then "manually" create
the PlayerGames table by running a union query over games twice, once
extracting the white player and then extracting the black player. I
guess if I run it a third time extracting bye players I'm going to be
on the right track?

This was part of my concern in the mislabeling of the columns. I feel
it is OK to have a white column for white playerID and a black column
for black playerID. But if I now have to look at another column to
determine if the white column really contains a white playerID or a
bye playerID, then I think the design is messed?

As for the moves column, I envisage that this should stay in the Game
table - both sets of players have the same sequence of moves to
comprise a game and I'll simply store an object (to be devised) that
contains all the moves for the game.

Hmm... thinking about this, I might have to create a game status of
"adjourned" if I'm going to be really serious about this. The kids in
school play fast, but for tournaments, some games may be adjourned for
a while.
 
K

Ken Sheridan

A data entry form for games with the normalized tables I suggested is no
problem; you'd have a form based on the games table in single form view and a
subform based on the GamePlayers table in continuous form view, with GameID
as the LinkMasterFields and LinkChildFields properties. This would have two
rows normally of course, but only one for a bye.

'Adjourned' would be a value of a Status attribute, i.e. column, of the
games table of course, so that's easy to implement. Strictly speaking all
legitimate values of Status should be unique values in rows in a GameStatuses
table which would be related to games on the Status columns with referential
integrity enforced. The same is true of ColourPlayed, where 'White',
'Black' and 'N/A' would be unique values in rows of a ColoursPlayed table, in
this case related to GamePlayers. You could of course, simply have value
lists as the RowSource properties of Status and ColourPlayed combo boxes, but
this would violate the Information Principle which I referred to in my first
reply as it would involve storing data in a property of controls, not as
values in tables. If used at all such value lists are best used only for
immutably fixed sets of data from the external world, e.g. months or days of
the week. Even then the values allowed should be constrained in the table
definition, not merely via the interface.

Ken Sheridan
Stafford, England
 
M

MikeB

I read what you write, but half of it just doesn't make sense to me.
Perhaps I just don't know enough about Access to understand what
you're telling me.

*How* do I have a subform in "continuous form view" so that it shows
fields for the two playerIDs and the game result? Right now I have a
simple form that has three fields - White, Black and Result which is a
combobox. White and Black are drop-down lists of all the players'
names. Perhaps I should put up a screenshot somewhere?

I really don't see that I should have a table to contain 3 rows
"White," "Black," and "Bye", if there is never any likelihood of those
values changing or additional values being added. Then every table has
to be a query so that they can retrieve the proper values for these
fields.
 
K

Ken Sheridan

I read what you write, but half of it just doesn't make sense to me.
Perhaps I just don't know enough about Access to understand what
you're telling me.

Its more a question of understanding the principles of the relational
database model than of Access. The model, which has been developed through
many years of research, and is the subject of a vast academic literature
since E. F. Codd's first paper in 1970, provides the theoretical basis for
all relational database management systems like Access. I'll take your two
points separately:
*How* do I have a subform in "continuous form view" so that it shows
fields for the two playerIDs and the game result? Right now I have a
simple form that has three fields - White, Black and Result which is a
combobox. White and Black are drop-down lists of all the players'
names. Perhaps I should put up a screenshot somewhere?

No screenshot necessary, I see exactly what you are doing.

The subform would show a row for each player not separate fields in one row.
The GamePlayers table in fact models a many-to-many relationship type
between players and games, and each column in the table, PlayerID,
ColourPlayed, Score etc is an attribute of the relationship type. A
relationship type is really just a special kind of entity type, which are
what tables model in a relational database. The result of each should not be
stored but computed from the two (or one in the case of a bye) rows per game
in GamePlayers. This could be shown on the parent form by means of an
unbound text box, whose ControlSource would be a function (easily written)
which returns the result on the basis of the row(s) in GamePlayers for the
current game. To store the result in a column in a row in a base table would
introduce redundancy and the risk of inconsistent data.
I really don't see that I should have a table to contain 3 rows
"White," "Black," and "Bye", if there is never any likelihood of those
values changing or additional values being added. Then every table has
to be a query so that they can retrieve the proper values for these
fields.

Its nothing to do with the possibility of the values changing, it’s a
question of maintaining the integrity of the data. It allows referential
integrity to be enforced, thus preventing invalid values being entered in
GamePlayers. There would be no need to include the ColoursPlayed table in a
query as the keys are 'natural' ones not surrogate numeric keys, so "White,"
"Black," and "Bye" are the values in the foreign key ColourPlayed column in
GamePlayers, referencing the primary key of ColoursPlayed. Anyway, its more
usual to use a query than a table as a RecordSource etc, so even where the
use of a surrogate key makes the inclusion of the referenced table necessary,
this is perfectly normal. Without a ColoursPlayed table you'd have to limit
what values can be entered in GamePlayers by a validation rule in the table
definition, which is a reasonable alternative. What should not be done,
however, is to rely solely on the interface to limit the input, i.e. by a
value list as the RowSource property of a combo box.

Ken Sheridan
Stafford, England
 
M

MikeB

Its more a question of understanding the principles of the relational
database model than of Access. The model, which has been developed through
many years of research, and is the subject of a vast academic literature
since E. F. Codd's first paper in 1970, provides the theoretical basis for
all relational database management systems like Access. I'll take your two
points separately:


No screenshot necessary, I see exactly what you are doing.

The subform would show a row for each player not separate fields in one row.
The GamePlayers table in fact models a many-to-many relationship type
between players and games, and each column in the table, PlayerID,
ColourPlayed, Score etc is an attribute of the relationship type. A
relationship type is really just a special kind of entity type, which are
what tables model in a relational database. The result of each should not be
stored but computed from the two (or one in the case of a bye) rows per game
in GamePlayers. This could be shown on the parent form by means of an
unbound text box, whose ControlSource would be a function (easily written)
which returns the result on the basis of the row(s) in GamePlayers for the
current game. To store the result in a column in a row in a base table would
introduce redundancy and the risk of inconsistent data.

Right, please bear with me. I have the feeling this is an important
concept that I'm just not getting. It may explain why I can't use the
Games table, but have to run a union query over the games table to
derive the source data for subsequent reports I'm creating.

Today I have a table [Games] with the following fields: GameID,
MatchID, WhitePlayerID, WhiteScore, BlackPlayerID, BlackScore Result,
Moves, Time

If I understand you correctly, I need to change this to:

Table [Games] with fields: GameID, MatchID, Moves, Time
and two records in Table [GamesPlayers], with the following fields:
GameID, PlayerID, Color, Score.

Then I have to design a form with a subform as follows: Form Games
allows me to Add a new Game and then in the subform (which I presume I
have to open in Datasheet mode to see two rows at the same time) I
enter in the PlayerID (how do I get them in the form of a drop-down
list?) and the result. But how does the form ensure that two players
get entered for each game, unless the Color field says "Bye?" How does
the form ensure that the sum of the two records' score is 1,
appropriately apportioned between white and black depending on who won/
draw?

I've looked in the Access books I have and this doesn't seem to be
covered.

As I type this, a thought crossed my mind. Did you intend (and I'm not
knowledgeable enough) that the data is entered in the Games Form and
then programatically propagated to the two subrecords in GamesPlayers?
I can kind of see how that would work, but it seems to be a SMOP
(small matter of programming) that I'm not familiar with.

Its nothing to do with the possibility of the values changing, it's a
question of maintaining the integrity of the data. It allows referential
integrity to be enforced, thus preventing invalid values being entered in
GamePlayers. There would be no need to include the ColoursPlayed table in a
query as the keys are 'natural' ones not surrogate numeric keys, so "White,"
"Black," and "Bye" are the values in the foreign key ColourPlayed column in
GamePlayers, referencing the primary key of ColoursPlayed. Anyway, its more
usual to use a query than a table as a RecordSource etc, so even where the
use of a surrogate key makes the inclusion of the referenced table necessary,
this is perfectly normal. Without a ColoursPlayed table you'd have to limit
what values can be entered in GamePlayers by a validation rule in the table
definition, which is a reasonable alternative. What should not be done,
however, is to rely solely on the interface to limit the input, i.e. by a
value list as the RowSource property of a combo box.

This I get. I'll look into changing it in my next rework of the setup.

Thanks.
 
K

Ken Sheridan

It might help if we first have a brief look at the underlying principles of
the relational model as a grasp of the theoretical basis makes the practical
application of those principles a whole lot easier.

The database relational model was originally proposed by E F Codd in a paper
in 1970 in the journal 'Communications of the Association for Computing
Machinery'. Since then there has been a vast amount of further theoretical
work, and the relational model has shown itself to be a robust one. Without
going too deeply into the theoretical basis, which can be quite abstract, a
relational database in essence models a part of the real world in terms of
its entity types and the relationship types between them. Note the inclusion
of the word 'type' in both cases here. While its almost always used in the
former case, its often omitted in the latter case. This is a little bit
sloppy but not too important. When one talks about a 'relationship' it
really refers to a relationship value. As an example 'marriage' is a
relationship type, but my being married to my wife Fiona is a relationship
value, represented by our names on the marriage certificate, which is the
physical equivalent of a row in a Marriages table with columns Husband and
Wife, each referencing the primary key of a table People. This is a
many-to-many relationship type (I've been married twice so would be in two
rows, my first wife would also be in two rows as she remarried too). It is
resolved into two one-to-many relationship types, People to Marriages in each
case, in one case via the Husband column in the other via the Wife column.

In a relational database tables model Entity Types. In the above example
People is an entity type, modelled by the People table. Marriage is also an
entity type, modelled by the Marriages table. As we've seen its also a
relationship type. In fact a relationship type is just a special kind of
entity type.

Each column in a table represents an attribute type of each entity type, so
attribute types of People might be FirstName, LastName, DateOfBirth etc.
This table would also have a PersonID numeric column (usually an autonumber)
as its primary key as names are not unique. Each row in a table represents
one instance of the entity type, and the attributes of each instance are
represented by values at column positions in the row. This is the only way
that data can be legitimately stored in a relational database.

Its important that there is no redundancy in the information content of the
database. This is achieved by the process of 'normalization'. Normalization
is based on a set of 'normal form's ranging from First Normal Form (1NF) to
Fifth Normal Form (5NF). There is also one called Boyce/Codd Normal Form
(BCNF) which was inserted when it was found that the original Third Normal
Form was deficient; it didn't cater satisfactorily for tables with two or
more candidate keys where thee keys were composite and overlapped, i.e. had a
column in common. I won't go into the details of normalization here; you'll
find it written up in plenty of places.

To see an example of redundancy and therefore a table which is not properly
normalized take a look at the Customers table in the sample Northwind
database which comes with Access. You'll see that it includes City, Region
and Country columns. If you look at its data you'll se that we are
redundantly told that São Paulo is in province SP which is in country Brazil
4 times. This is not just inefficient, it is dangerous as it leaves the
table open to inconsistent data being entered. There is nothing to stop
somebody putting São Paulo in the UK, USA or in each in separate rows in the
table for instance. To normalize the table it should be decomposed into
Customers, Cities, Regions and Countries tables, each of the first three with
a foreign key referencing the primary key of the next table up in the
hierarchy.

So, lets look at your scenario and identify the entity types and
relationship types involved, and the tables needed to model them. The role
of the first four below, and the columns they'd have will be obvious I think:

1. Players.

2. Matches

3. ColoursPlayed

4. GameStatuses

5. Games. This table has rows representing each game, with columns
representing the attributes of the game as a whole. One of these would be
GameStatus as this is an attribute of the game, not of a particular player's
involvement in it. In the jargon its said to be 'functionally dependent' on
the key of the table. This would be a foreign key column referencing the
primary key of the GameStatuses table. You are in a better position to
identify what other columns might be in this table.

6. GamesPlayers. This models a relationship type between Players and Games
so would have foreign key columns referencing the primary keys of each, e.g.
PlayerID and GameID. Together these form the composite primary key of this
table as the combined values PlayerID and GameID must be unique. It also has
attributes ColourPlayed and GameStatus, which are also foreign keys
referencing the primary keys of the ColoursPlayed and GameStatuses tables, so
in fact the relationship type is really between all four tables. And it has
a Score column of course. I'll say more about Moves below.

7. Moves. How you include these depends on the degree of analysis you
might want to make of moves. If you simply want to record the moves for a
game, without being able to compute any analysis of the moves, then you can
simply enter than into a memo field using conventional notation. This would
be an attribute of the game as a whole so the memo field would be in the
Games table. If on the other hand you want to be able to undertake some
analysis on the moves then you'd have a separate Moves table with one row per
move, with columns Move, TimeMoved (don't call the column Time; that's the
name of a built in function). This table would have a composite foreign key
of PlayerID and GameID referencing the similarly composite primary key of
GamesPlayers.

Now as regards the interface the main input would be via a Games form, based
on the games table and a GamesPlayers subform within it, linked on GameID.
The data would not be inserted into rows in the subform via code but manually
by selecting the name of each player, the colour, status and entering the
score per player. By uniquely indexing on GameID, PlayerID and ColourPlayed
in the table definition both players cannot be given the same colour. To
ensure only one row per Bye is entered and two rows, no more no less, per
played match are entered would require some validation code in the subform's
BeforeUpdate event procedure. The same event procedure would validate the
scores, though this would only kick in when the second player is entered of
course as even Access does not have powers of prescience! Code in the
AfterUpdate event procedure and the parent form's Current event procedure
could also set the subform's AllowAdditions property to True or False as
appropriate. I'd suggest that you concentrate on getting the form/subform
set up and working before tackling the validation code, however. We can
always come back to that later.

Finally one point which has just occurred to me, arising from my own
ignorance of how chess matches are organised (I'm not a chess player, the
sort of matches I've played in generally took place in a muddy field and
involved beating the opposition to a pulp!). Is a match a set of games
between the same two players or is it a set between different combinations of
players? What I'm really asking is whether a match is between two teams or
two individuals. If it’s the latter then the Matches table models a
relationship between two instances of Players, so the Matches table would
include two columns Player1 and Player2 say, both referencing the PlayerID
key of Players. This could then be used to control which players can be
selected in the GamesPlayers subform, i.e. the combo box would only list the
two players for the match in question.

Ken Sheridan
Stafford, England

MikeB said:
Right, please bear with me. I have the feeling this is an important
concept that I'm just not getting. It may explain why I can't use the
Games table, but have to run a union query over the games table to
derive the source data for subsequent reports I'm creating.

Today I have a table [Games] with the following fields: GameID,
MatchID, WhitePlayerID, WhiteScore, BlackPlayerID, BlackScore Result,
Moves, Time

If I understand you correctly, I need to change this to:

Table [Games] with fields: GameID, MatchID, Moves, Time
and two records in Table [GamesPlayers], with the following fields:
GameID, PlayerID, Color, Score.

Then I have to design a form with a subform as follows: Form Games
allows me to Add a new Game and then in the subform (which I presume I
have to open in Datasheet mode to see two rows at the same time) I
enter in the PlayerID (how do I get them in the form of a drop-down
list?) and the result. But how does the form ensure that two players
get entered for each game, unless the Color field says "Bye?" How does
the form ensure that the sum of the two records' score is 1,
appropriately apportioned between white and black depending on who won/
draw?

I've looked in the Access books I have and this doesn't seem to be
covered.

As I type this, a thought crossed my mind. Did you intend (and I'm not
knowledgeable enough) that the data is entered in the Games Form and
then programatically propagated to the two subrecords in GamesPlayers?
I can kind of see how that would work, but it seems to be a SMOP
(small matter of programming) that I'm not familiar with.
 
M

MikeB

It might help if we first have a brief look at the underlying principles of
the relational model as a grasp of the theoretical basis makes the practical
application of those principles a whole lot easier.

Wow. Ken, thanks. It's going to take me a few days to digest all of
this.
 
M

MikeB

Ken, I've looked at your recommendations and have done some playing
around. I think I understand the theory of what you're saying. In
fact, I have been doing something akin to your recommendations "behind
the scenes" since I ran a Union Query over my old Games table twice,
extracting and joining first games played by white and then games
played by black. So in essence I ended up with a PlayerGames Query
that I based many of my reports on.

Now I'm trying to implement a PlayerGames table directly.

If the game is a bye, I have to add one record to PlayerGames. If the
game is a normal game, I have to add two records to PlayerGames and I
have to make sure that a) one record indicates white, the other black
and b) that the sum of the points awarded in the two records is 1.
(1/0, 0.5/0.5, or 0/1 - depending on white win, draw or black win).

I don't understand how I design a Game form that will allow me to
enter data in one or two subforms for Playergames. It seems to me that
I have to do extensive coding to accomplish this? Am I correct?

If I am, I'd much rather revert to my previous incorrect model where
the player results are stored in the Games table (that way I can do
all the validation in the form) and then run my "extract" or Union
Query to create PlayerGames as a query for further processing.

Any comments or more helpful advice?

PS. I'm now the proud owner of about 5 Access books. I started with
just one, but things are getting so complex I keep needing more
information. One is on programming, but I think it's too basic, it is
trying to teach VBA, not the data access part of interacting with the
database.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top