There is a new page on the Burns Statistics website
http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html
that looks at spreadsheets from a quality assurance perspective.
....
Two general points. First, spreadsheets are often the only programmable
system with GUI interface most business users are given (most are given
Windows Scrioting with VBScript and VJScript bundled with Windows, but few
are aware of having it, and it provides no GUI IDE). Second, I agree that
spreadsheet models that take more than a few minutes at most to recalc are
too complex to remain as spreadsheet models, *but* it's usually not the
developer or maintainer of such systems that decide to leave them as
spreadsheet models.
The first bears immediately on your technology acceptance model. If XL is
believed to be the only tool some person has to develop any computationally
intensive models, then that person will use XL because by it's exclusive
availability it automatically satisfies your technology acceptance model.
It's the lack of alternatives for most people who've been given develoment
tasks that explains the overuse of spreadsheets.
Details
Spreadsheet Computation
- DATA CHECKING
This is debatable. There are other languages that automatically treat
strings as numeric zeros in numeric calculations (most scripting languages).
This is a design trade-off between convenience and type checking rigor.
Given the explosive growth of scripting languages (as well as spreadsheets)
compared to strongly typed languages over the last two decades, it seems
very clear that developers have placed a higher value on convenience than
rigorous type checking. Why should spreadsheets differ?
Granted there are risks involved with this, but shouldn't adult developers
be allowed to decide whether the advantages more than offset the risks?
Also, Excel 2003's SmartTags provide some data checking functionality.
- DATA STRUCTURE
Your claim that "Complex data [...] demand a convention for placement of the
components." is to vague and too general. There may be structures that allow
for more efficient access during calculations, but it just doesn't matter
whether one variable in a given calculation is placed to the left or right,
above of below another in a worksheet. Related data should be stored close
together, and data layout used should allow formulas to be as simple and
efficient as possible, but that's all.
- COMMAND HISTORY
If you mean recording every action, then such histories could grow much
larger than the workbook itself. That's a nonstarter. If you mean there
should be facilities to record formulas, formatting, VBA code and forms in
plain text that could be used in revision control systems as is standard
software engineering practice with more traditional programming languages,
then you've got a point. However, that's not going to happen until companies
decide to approach spreadsheet development the way they approach other
software development - as an endevor requiring disciplined procedures.
The Treatment Center (Alternatives)
- COMPUTATION
I like R, and I use it regularly, but you're crazy if you believe it could
be used efficiently by most current Excel users. First off, it uses a
command line interface. That actually gives it greater power than Excel, but
at the cost of considerable complexity. I've worked with people who believe
the same thing about APL. They're right that spreadsheets are inefficient,
but wrong in believing that spreadsheets are crippled. Also, undisciplined
use of R will produce as many problems as undisciplined use of spreadsheets.
Unless companies are willing to take *ALL* application development out of
the hand of people with no formal training (school coursework or in-house)
in software development, companies are stuck with spreadsheets as the least
of many evils.
Specific Problems with Excel
Excel's current dominance owes as much to what Lotus Development Corp didn't
do as to what Microsoft did do. Lotus was so immersed in the look & feel
lawsuits of the late 1980s that they misapprehended the importance of
Windows. What killed 123 wasn't that it was inferior to Excel (IMO, 123
Release 5 with only the classic macro facility was still better than Excel
5/95 with VBA, but a close contest), it was that so-called productivity
applications became bundled in suites, and Word was MUCH better than AmiPro.
[In retrospect, Lotus's single biggest mistake was in failing to pursue
acquisition of WordPerfect and settling for AmiPro instead. Of course, no
major upgrades to 123 between 1986 and 1989 didn't help either.]
- WRITING ASCII FILES
Agreed, but your criticism is too narrow. Excel also mangles some types of
user inputs, being far too ready to convert entries into dates. There should
be a way to turn off Excel's 'helpful' features.
- PROPAGATION OF BLANKS
Excel has always skipped blanks (and cells containing text and booleans)
when they were accessed within ranges. So if A1 contained 1 and A3 contained
5 while A2 was blank and A4 contained "xyz", =SUM(A1:A4) has always returned
6, and =AVERAGE(A1:A4) has always returned 3. Excel also has never provided
a mechanism to propagate blanks. Using A1:A4 as above, Excel must return
something for the formula =A2, and that something must be either a number, a
text string, a boolean value or an error value. Excel provides no value that
equals blank (though returning Empty from udfs does return a value Excel
will treat as blank, Excel itself has no means of generating such a value as
a formula result). Excel's convention, reasonable for a program intended
primarily for calculation, is to return 0 for =A2.
This limitation isn't unique to Excel. 123 and QuattroPro do the same thing.
- RANGE CHANGES
No big deal. Anyone with real spreadsheet development experience knows to
include blank cells bordering data ranges in order to prevent this. For
example, if I want to use data in C3:H22 that could later vary in the number
of rows or columns, I'll refer to B2:I23, keeping columns B and I and rows 2
and 23 blank or containing FALSE with number format ";;;""-""" and Fill
horizontal alignment. Then when I insert or delete rows or columns inside or
bordering C3:H22, the containment range B2:I23 will adapt as needed.
Needless to say, I disable range extension. Avoiding this comes with
experience, just as other common bugs in other languages are a sign of
inexperience.
- TYPING MISTAKES
If you want to trap 'em, use Data > Validation. Excel defaults to free form
input, so the contents of any cell need not be the same type as the values
in any adjacent cell. If you want to impose such restrictions, Excel
provides you the tools necessary to change its default behavior. Only
unintended date interpretation is a problem, and accidentally entering
slashes can be nearly eliminated by entering numbers using the numeric
keypad. USE THE RIGHT TOOLS FOR THE TASK, which in this case means USE THE
RIGHT SET OF NUMBER KEYS.
- UNARY MINUS
As already pointed out, with 3 in A1, the formula =-A1^2 returns 9 as does
the formula =-3^2 (and as does =-(3)^2). Your claim that the two formulas
return different results is wrong. This is due to nonstandard operator
precedence, but once established (and extant for over two decades now) it
can't be changed without causing errors in existing workbooks.
You need to rewrite this entire section to correct all the errors. [Very
poorly done job here. Did you really check any of the formulas in Excel?]
- BAD GRAPHICS
Not unique to Excel. No sensible person uses Excel charts for anything other
than presentations, which are a refined form of lying.
- POOR STATISTICS
As you mention yourself, you shouldn't use Excel for real statistics work.
While the continuous distributions and regression functions have been
improved considerably in Excel 2003, the discreet distributions are still
pretty poor.
- UNHELPFUL HELP
DEFINITELY! Microsoft has invested next to nothing on correcting long
standing errors in Excel's online help, why should they invest anything on
added help topics or expanded examples?
- NO DATABASE OF BUGS
What do you expect from a company that requires users to pay initially for
the privilege of reporting bugs in their software?
- LIMITED DIMENSIONS
Of the major commercial spreadsheets, only QuattroPro provides >256 columns
and >65536 rows, and it's far & away the buggiest of them. If you want more
dimensions, use Xess (a Unix spreadsheet that has a Windows version) or wait
until Gnumeric 1.4's Windows port is more stable, then tweak its sources and
rebuild to get more columns and/or rows.