a spreadsheet is not a database

  • Thread starter Thread starter Nils
  • Start date Start date
N

Nils

I'm trying to convince someone about the virtues of designing a proper
relational database with tables, relationships, referential integrity
etc. instead of dumping data into a flat-file Excel sheet.

Does anyone know of some good sources of info that goes into detail
comparing/contrasting how spreadsheets work vs. databases and the
pitfalls of spreadsheets for data storage.

I'm trying to create an argument for investing time into
analysis/normalization phase and show it's not wasted.

This person doesn't know much about relational databases.

Thanks in advance.
 
A key advantage (as you would know) is avoiding duplicated data. Take a
medical system, for example. Say John Smith's test are sometimes
recorded against John Smith, sometimes J. Smith, sometimes John A
Smith, and sometimes John A. Smith.

(1) Your database now has 4 different patients, instead of just one;

(2) That one patient's test results, might not be properly correlated.
(Perhaps two of the results are unimportant when they occur in
different people, but critically important when they occur in the same
person.)

(3) His accounts will be wrong. (You need to discount the amount when a
patient has two visits on the same day - but the database does not
realiz that the two visits were for the same individual.)

(4) The statistics are wrong. (4 patients instead of just one.)

And so on. You could point out that these kinds of problems ar
/;inevitable/ in a spreadsheet solution, or an unnormalized database,
or with /any/ storeage system which lets the same information (like
patient name) be duplicated in several places.

But these problems are far less likely to occur, in a properly
normalized system where each single piece of information (like a
patint's name), is tored only once.

HTH,
TC [MVP Access]
 
I'm trying to convince someone about the virtues of designing a proper
relational database with tables, relationships, referential integrity
etc. instead of dumping data into a flat-file Excel sheet. ....

This person doesn't know much about relational databases.

I am not sure that you necessarily have to:

Excel databases are fast to set up and easily corrected when mistakes are
found;

Excel does not have strong typing, so if you need a row number "124b" in
between 124 and 125 then it's easy;

Excel does a load of calculating as it goes, which is far more flexible
than an SQL query will ever be;

Excel displays a row-and-column view of the data which makes many naive
users happier than dropping everything into a "bucket of records";

Excel also has lots of simple user facilities (that's simple facilities
for users, not the other way round...) like automatic forms, query by
example, quick sorting, autofilter, and so on that are probably even
easier than their Access counterparts;

On the other hand, you could offer a primer in R theory, lots of practice
and experience in database and data semantics, requirements analysis,
data flow dynamics etc etc -- but how often is your person going to need
these skills afterwards?

The advantages of R are about resilience and integrity of data, and the
ability to guarantee the modelling in complex data environments. The DBMS
part gives you a sophisticated security model on top. In the end, most
databases are just lists of stuff and it hardly matters whether someone
is spelled P.A. James or PAJames or an unholy mixture. The people here
(including me) tend to use RDBMs because that's what we know: "to a man
with a hammer every problem looks like a nail". But to someone else, a
different tool altogether is often appropriate.

If you are going to justify spending this person's time and money, you
need to do it in terms of the requirements of his project. Is he going to
go bust/ to jail/ mad/ etc if he has a Spondiff without a valid
SpondiffSpecifier? If the answer is yes, then he needs a R database
designed (either by him or for him). If the answer is no, then there are
plenty of better or alternative approaches.

Just a thought....



Tim F
 
Nils said:
I'm trying to convince someone about the virtues of designing a proper
relational database with tables, relationships, referential integrity
etc. instead of dumping data into a flat-file Excel sheet.

Does anyone know of some good sources of info that goes into detail
comparing/contrasting how spreadsheets work vs. databases and the
pitfalls of spreadsheets for data storage.

I'm trying to create an argument for investing time into
analysis/normalization phase and show it's not wasted.

This person doesn't know much about relational databases.

Thanks in advance.

Who will be using this?
If it's an individual doing it for himself and he is comfortable with it
then Ferguson's ideas are sound.

However if reporting and having other people using it are a factor then
Access is a far better choice even if it maintained the flat file approach.

I've replaced Excel based systems run by an individual who gathered
information from the users with Access based ones run at the user level.
Many of the reports which were blood and tears for Excel were trivial in the
Access world.
 
Back
Top