Would normalization improve or hinder response time?

J

jacksonmacd

This is about an application developed by another division of the
company where I work, so I don't really hold any influence over the
database structure. However, I have provided my opinion on occasion --
but it is generally ignored. Go figure...

It's about a table containing many (up to millions) of records of time
intervals. One attribute of the table is the StopCode -- a code number
to explain what happened during the time interval. StopCodes are
related one-to-many to StopCategories. The number of StopCodes would
be in the 10s to 100s, and the number of StopCategories would be less
than 10. So far, so good.

Howeverr, the table is flattened -- it has columns for both StopCode
and StopCategory. The developers/custodians believe that the violation
of Normalization rules is justified because it will "improve
performance because the reporting engine does not need to look up the
StopCategory for every record."

I believe that's just bunk! however, I can *sorta* see their point --
the lookup is done once, and then forever stored in the table. With
normalization, a join needs to be performed every time the database is
queried. Makes some kind of sense...

Then it occured to me that the database engine would (probably?)
retrieve the StopCategories once during any query, and retrieve one
column from the table, and do the join in memory. But with the
flattened structure, it would need to retrieve two columns from the
table for every record. Seems to be eliminating a join but increasing
the amount of disk activity. Seems to be trading an in-memory activity
with an on-disk activity, which *must* be slower.

Does my logic make sense? Proper/appropriate indexing would obviously
be required, and YMMV depending of any number of factors, but I am
interested in the principle. Does the same answer apply whether the
database is Jet or SQL Server? I would like to be in a position of
knowledge to counter their "joins are too slow" arguement.

Thanks.
 
A

Armen Stein

Howeverr, the table is flattened -- it has columns for both StopCode
and StopCategory. The developers/custodians believe that the violation
of Normalization rules is justified because it will "improve
performance because the reporting engine does not need to look up the
StopCategory for every record."
I believe that's just bunk! however, I can *sorta* see their point --
the lookup is done once, and then forever stored in the table. With
normalization, a join needs to be performed every time the database is
queried. Makes some kind of sense...

It could improve times just a bit. A join to a lookup table in itself
isn't that costly, but in your case you may be doing it for millions
of records. I suppose it would depend also how complex the rest of
the query is, especially if StopCategory is also joined to other
tables.

The main concern I would have is that a StopCode might be reassigned
to a different StopCategory. Do they have triggers in place to update
millions of records with the new value? Or is this data frozen once
it is generated?

We've denormalized for performance before, but never for a simple
lookup. It's usually for an aggregate of some kind (e.g. the latest
date, an order total.), that we've proven will cause unacceptable
performance. And we make sure that triggers keep the denormalized
value current.
Does my logic make sense? Proper/appropriate indexing would obviously
be required, and YMMV depending of any number of factors, but I am
interested in the principle. Does the same answer apply whether the
database is Jet or SQL Server? I would like to be in a position of
knowledge to counter their "joins are too slow" arguement.

Well, I wouldn't likely store millions of records in Jet to begin
with. But again, in either Jet or SQL Server I don't think it makes
sense to denormalize a simple lookup unless specific testing has been
performed to prove that the normalized structure is too slow.

Armen Stein
Microsoft Access MVP
www.JStreetTech.com
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top