Trying to normalize - assistance requested

H

Hilary Ostrov

Using Access 2000, I have an .xls (which is listing of 7600 Items)
I've imported, which I'm *trying* to normalize. But it seems I've
been chasing myself in query circles that go nowhere :(

Each record has the following fields:

Description
Number
ItemType
ID
ParentID
OrderNo

First the easy parts:

Description is self-explanatory

ItemType is either 5 or 0 (5 being readily identifiable as either a
Category, Sub-Category or - what I'm calling - "Group"; and 0 being
the actual item)

OrderNo is actually a sequence to indicate order of appearance
(crucial to retain)

Number is in one of two formats: nnnn or ~~~nnnn, the latter (of which
there are 857 in total) being the clue that the record is a Category,
SubCategory, or Group.

So far so good!

I can readily pull out Categories and SubCategories and, of course,
the actual Items. BUT the hierarchy is such that while the majority
of SubCategories have Groups many don't. The end result being that
*some* actual Items' Parents are SubCategories and other actual Items'
Parents are Groups. There's even a small number of Items whose Parent
is a Category.

I do need to retain this hierarchy for identification and searching
purposes (and order of appearance) but I can't seem to figure out an
effective (or efficient) way of splitting the original data so that
it's normalized. Given all of the above, it seems I have two options:

1) Accept the fact that there will be a number of items for which
there will be no group; or

2) Make some fake groups for the groupless by giving them the same
name/ID as the SubCategory.

Which is the lesser of the two evils in a situation like this - or is
there a third option that as a relative newbie I've been unable to
divine?!

Any guidance or helpful hints would be very much appreciated. Thanks.
hro
 
J

Jeff Boyce

Hilary

To normalize, step back from how you've been organizing your data and
consider how you'd explain the "things of interest" and "how they are
related" to someone who knows nothing about your subject area. Like my mom,
for example...

It sounds like, but I'm not real clear, that you have some kind of items
(?Item), some way of categorizing the items (?Type), and some way of
describing characteristics of the items (?category, subcategory, ??group),
and that these are somehow related to each other (?!?).

Your "OrderNo" sounds like a display sequencing (?sort order)
characteristic, but I'm not sure if that relates to your item, a category,
or ???

You didn't describe what use you're making of "ParentID".

Your description of "Number" left me confused -- are these actually numeric
values, that can be added and subtracted, or are they "codes"?

Can you provide a bit more explanation of the topic area? Could you provide
a small example of the kind of data?

By the way, Access will probably try to treat several of your field names as
reserved words, leading to confusing results. Consider renaming "Type" and
"Number" to start with!
 
H

Hilary Ostrov

Hilary

To normalize, step back from how you've been organizing your data and
consider how you'd explain the "things of interest" and "how they are
related" to someone who knows nothing about your subject area. Like my mom,
for example...

:) Sorry, Jeff ... Perhaps I should have slept on my original post
before I sent it!

Anyway, by way of background, the .xls data I'm trying to normalize
came from a .dbf file which had been used in another (non-Access
obviously!) application. The list is a list of electrical parts.

Category is the top level of the hierarchy. There are 22 Categories
and they must appear to the user (in any Reports) in a specific order
(hence the OrderNo field) - as must the SubCategories within
Categories, Groups within Sub-Categories, and Items within Groups (or
Items within SubCategories or - in a few instances - Items within
Category).

There are 6700+ distinct Item records. Of these, all but 800+ can be
shown as:

ItemNo Description Category SubCategory Group

And they can be displayed in ascending order of appearances within
their respective Category, SubCategory and Group.

The ParentID's of the other 800+ distinct Item records could
"translate" into *either* a Category or a SubCategory (although most
often a SubCategory) but never a Group! So these 800+ Item records
are the problem!
It sounds like, but I'm not real clear, that you have some kind of items
(?Item), some way of categorizing the items (?Type), and some way of
describing characteristics of the items (?category, subcategory, ??group),
and that these are somehow related to each other (?!?).

The "Type" is probably superfluous and only distinguishes between a
*real* Item (Type 0) or a Category, SubCategory or Group (all of which
are Type 5)
Your "OrderNo" sounds like a display sequencing (?sort order)
characteristic, but I'm not sure if that relates to your item, a category,
or ???

It is a display sequence indicator. The Categories must appear in a
specific order (non-alpha) as must any SubCategories (which have their
own non-alpha ordering within their respective Category). Likewise
for Groups within SubCategory and Items within Groups (or within
SubCategory or Category)
You didn't describe what use you're making of "ParentID".

I am using the ParentID to identify the respective Category and/or
SubCategory and/or Group.
Your description of "Number" left me confused -- are these actually numeric
values, that can be added and subtracted, or are they "codes"?

Sorry, they are "codes". The relevant ones are those that are part of
each Item record.
Can you provide a bit more explanation of the topic area? Could you provide
a small example of the kind of data?

Perhaps if you *see* the original data as well as what I'm trying to
accomplish, this will compensate for my inadequate descriptive
language :) If you wouldn't mind, perhaps you could take a look at:

www.myssiwyg.ca/Items.zip
By the way, Access will probably try to treat several of your field names as
reserved words, leading to confusing results. Consider renaming "Type" and
"Number" to start with!

I actually had done that already! In my first post, I had listed the
fieldnames as they appeared in the original .dbf/.xls

Thanks for your help.


hro
 
R

rkc

Hilary Ostrov said:
:) Sorry, Jeff ... Perhaps I should have slept on my original post
before I sent it!
<snip>

What about the original structure do you feel is not normalized?
 
J

Jeff Boyce

I have not visited the web link provided -- just as I don't download
attachments from someone not well known to me.

I wasn't suggesting your structure isn't normalized, I was asking for more
information to understand what your data structure is. Your original post
indicated that you weren't satisfied with the normalization.

Are we talking about the same thing?! I'm referring to the process of
identifying entities and relationships.

If I were to attempt to normalize the data you described in your response,
my first cut would be something like:

tlkpCategory
CategoryID
CategoryTitle
CategorySeqNumber

tlkpSubCategory
SubCategoryID
CategoryID
SubCategoryTitle
SubCategorySeqNumber

tlkpGroup
GroupID
CategoryID
SubCategoryID
GroupTitle
GroupSeqNumber

tblItem
ItemID
ItemDescription
CategoryID
SubCategoryID
GroupID

I'm still not clear on what (Code)Number represents, or how you use
ParentID.

Am I still not 'getting it'?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top