500mb+ DB - Help!

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I have a test data DB with data from 4 searate part testers. They all have
fields with the date, the time, which tester it is, and whether it passed or
not. The main tester just also does ~100 other tests and outputs that data.
I therefore get 4 spreadsheets everyday. Because of that I went ahead and
combined all the data into in one table so when I analyzed it I could get for
instance the total amount of failures for the day/week/month. Of course this
leaves 3/4 of the table being blank. Doing this with one month is fine, the
database is only about 40mb and it doesnt take too long to bring up various
graphs to look at the data. Now that I want to look at the data from the
past year, with a pivot chart it takes 30mins+ with every step meaning to get
the chart I want would probably take over 2 hours! Even though there are
about 100 tests Im only usually graphing 1 yet it looks like my computer is
trying to squeeze the entire db into memory.

At first I tried separating tables by months and doing union querys to
combine months if I wanted, but that was a litte tedious. Ive also
compacted/repaired.

Is there anything I can do to get things to go a little quicker?
 
TWilson said:
I therefore get 4 spreadsheets everyday. Because of that I went ahead and
combined all the data into in one table so when I analyzed it I could get for
instance the total amount of failures for the day/week/month. Of course this
leaves 3/4 of the table being blank.

That means you put the columns side by side instead of appending new rows
into the existing table structure where there's only *one* column each for
date, time, tester and pass/fail, not four of each. You can use the import
wizard to append the rows of each subsequent spreadsheet after you import the
first one to get the table structure. You'll find your queries run much
faster with a normalized table structure.
 
No, I made sure there is only one column for each type of data they share. I
use a macro in excel that combines the data before I import it to access. My
headings are basically: Date - Time - Tester - Pass/Fail - Tests (~100 of
which are only filled by the first tester).
 
ok...I'll ask the obvious question...have you compacted the db?
Tools>>Database Utilities>>Compact Database
 
TWilson said:
No, I made sure there is only one column for each type of data they share.

I'm confused. Your description of *combining all the data* of four
spreadsheets leaving *3/4 of the table blank* suggests only 1/4 of the
columns have data for any given row. If you're saying that's not the case I
must have misunderstood you because I can't really picture 3/4 of the rows
being blank when the columns are normalized. What is making the table 3/4
blank after you've combined the four spreadsheets into one?

If your columns are already normalized put indexes on relevant columns to
make queries run as fast as possible.
 
No, I made sure there is only one column for each type of data they share. I
use a macro in excel that combines the data before I import it to access. My
headings are basically: Date - Time - Tester - Pass/Fail - Tests (~100 of
which are only filled by the first tester).

Do you mean that you have ~104 Fields in this table - one field for
each test!? If so you're still emphatically NOT normalized.

John W. Vinson[MVP]
 
Ahhhh, I see. Something else I need to learn to do. Im sorry, Im very new
to access.

How do you guys reccomend I should go about splitting up the table to
normalize it?
 
Ahhhh, I see. Something else I need to learn to do. Im sorry, Im very new
to access.

How do you guys reccomend I should go about splitting up the table to
normalize it?

Three tables.

Runs
RunID
RunTime <date and time in one field, don't use two>
Tester
<other info about the test as a whole>

Tests
TestID
TestDescription <what's now your fieldnames>

Results
RunID <link to Runs, who ran the test when
TestID <what they ran
Result < the result of this test

This gives you "tall-thin" tables; if some tests weren't run you'll
simply have no records in the table at all for that test.

You *MIGHT* want a pass/fail field in the Runs table, but strictly
speaking it's derived data - I don't know what constitutes a pass or a
failure so I'm not sure how you'ld derive it.

John W. Vinson[MVP]
 
Ok. Couple questions.

About the pass/fail stuff, I neglected to mention a detail about that so
things looked simpler. The testers actually report a "status", with pass or
a failure code. There's a failure code depending upon which test the part
failed for. I guess Ill also add that once a part fails, no other tests are
done and the failure code for that certain test is reported. I also have a
list that identifies each failure code. So would I be right to add another
table:

Status
Status ID (failure code)
Failure Name,

then add another field to the "Runs" table for the Status ID?



From before you have a general idea of what my first table looked like.
That is also pretty much the format the testers give me the data. How do I
easily put all this data into their respective tables?
 
About the pass/fail stuff, I neglected to mention a detail about that so
things looked simpler. The testers actually report a "status", with pass or
a failure code. There's a failure code depending upon which test the part
failed for. I guess Ill also add that once a part fails, no other tests are
done and the failure code for that certain test is reported. I also have a
list that identifies each failure code. So would I be right to add another
table:

Status
Status ID (failure code)
Failure Name,

then add another field to the "Runs" table for the Status ID?

You've got it!! Exactly.
From before you have a general idea of what my first table looked like.
That is also pretty much the format the testers give me the data. How do I
easily put all this data into their respective tables?

A "Normalizing Union Query" can do this. If your incoming data
(spreadsheet?) has

Date - Time - Tester - Pass/Fail - Tests (~100 of
which are only filled by the first tester).

then you could create a Union Query:

SELECT [Date] + [Time] AS TestDateTime, Tester, Test1 FROM Wideflat
WHERE Test1 IS NOT NULL
UNION ALL
SELECT [Date] + [Time], Tester, Test2 FROM WideFlat
WHERE Test2 IS NOT NULL
UNION ALL
<etc. etc.>

This union query will "unravel" the spreadsheet into a tall-thin
table; you can then base one or more Append queries on the Union query
to populate your normalized tables.

John W. Vinson[MVP]
 
Ok, Im a little lost. Very new to access remember.

I assume Im supposed to import that wide spreadsheet and then run that query
on it? Well, thats what I did, but then under the "test1" column, the name
of the first test is displayed then all the data from the other tests goes
under it. How will it know where the different tests are in the column?

Then say I did get that working. Can you give me a little help, maybe an
example of what the append query would look like? How would I go about
assigning a key to the tables? I also assume all these "tables" will
actually be queries?

John Vinson said:
From before you have a general idea of what my first table looked like.
That is also pretty much the format the testers give me the data. How do I
easily put all this data into their respective tables?

A "Normalizing Union Query" can do this. If your incoming data
(spreadsheet?) has

Date - Time - Tester - Pass/Fail - Tests (~100 of
which are only filled by the first tester).

then you could create a Union Query:

SELECT [Date] + [Time] AS TestDateTime, Tester, Test1 FROM Wideflat
WHERE Test1 IS NOT NULL
UNION ALL
SELECT [Date] + [Time], Tester, Test2 FROM WideFlat
WHERE Test2 IS NOT NULL
UNION ALL
<etc. etc.>

This union query will "unravel" the spreadsheet into a tall-thin
table; you can then base one or more Append queries on the Union query
to populate your normalized tables.

John W. Vinson[MVP]
 
Ok, Im a little lost. Very new to access remember.

I assume Im supposed to import that wide spreadsheet and then run that query
on it? Well, thats what I did, but then under the "test1" column, the name
of the first test is displayed then all the data from the other tests goes
under it. How will it know where the different tests are in the column?

Then say I did get that working. Can you give me a little help, maybe an
example of what the append query would look like? How would I go about
assigning a key to the tables? I also assume all these "tables" will
actually be queries?

Ok, I guess I misinterpreted the data.

Please post the fieldnames of your (linked or imported) spreadsheet
table, perhaps with a row or two of sample data.

I do not understand your last question at all. I was not suggesting
multiple "tables".

John W. Vinson[MVP]
 
Sorry, they were actually two completely separate questions. I do understand
that Im gonna start out with a single tall thin table. I just also need help
after that. Lets take things a step at a time then.

The data from the 4 testers is actually given to me in 4 separate
spreadsheets. Since tester 1 is the only one that does all those separate
tests, its ok to combine them into one table.

Here's an example with actual table field names and data. With the 1st part
it passed tester 1, but failed tester 2.

Date and Time - Fixture - Status Code - Part - Pallet ID - Test1 - Test2...etc
1/1/2006 8:00am - Fixture 1 - Pass - 1 - 1 - 10.5 -
0.05...etc
1/1/2006 8:05am - Fixture 2 - 1000 - 1 - 1 - -
1/1/2006 8:10am - Fixture 1 - Pass - 1 - 2 - 10.4 -
0.06...etc


Heres some field info relating to terms we were previously using, along with
some new ones:
There is actually a Date and Time field - I was using text to columns to
split it up, so we can just use that one.
Fixture = Tester
Status code = failure code
Part - We actually run about 3 different parts. They're all tested the same
though.
Pallet ID - The ID number of the pallet the part sits on as it goes down the
assembly line
Test1-~Test100 - They all have their own separate names, we can leave it at
test1, test2...etc for now.

I really appreciate your help BTW :)
 
you shouldn't be using MDB format for anything in the year 2006.

it is completely obsolete.

I reccomend using SQL Server; or mySql.
 
Sorry, they were actually two completely separate questions. I do understand
that Im gonna start out with a single tall thin table. I just also need help
after that. Lets take things a step at a time then.

The data from the 4 testers is actually given to me in 4 separate
spreadsheets. Since tester 1 is the only one that does all those separate
tests, its ok to combine them into one table.

Here's an example with actual table field names and data. With the 1st part
it passed tester 1, but failed tester 2.

Date and Time - Fixture - Status Code - Part - Pallet ID - Test1 - Test2...etc
1/1/2006 8:00am - Fixture 1 - Pass - 1 - 1 - 10.5 -
0.05...etc
1/1/2006 8:05am - Fixture 2 - 1000 - 1 - 1 - -
1/1/2006 8:10am - Fixture 1 - Pass - 1 - 2 - 10.4 -
0.06...etc
Heres some field info relating to terms we were previously using, along with
some new ones:
There is actually a Date and Time field - I was using text to columns to
split it up, so we can just use that one.
Fixture = Tester
Status code = failure code
Part - We actually run about 3 different parts. They're all tested the same
though.
Pallet ID - The ID number of the pallet the part sits on as it goes down the
assembly line
Test1-~Test100 - They all have their own separate names, we can leave it at
test1, test2...etc for now.

So you want to migrate the data from these spreadsheets into a
tall-thin table (that's what I'd recommend anyway!)?

What's the structure of the tall-thin? What's its Primary Key?

Assuming that you have the normalized table set up as follows:

TestDateTime
Fixture
StatusCode
PartID
PalletID
TestCode
TestResult

then you could create a UNION query based on your spreadsheet:

SELECT [Date and Tim], [Fixture], [Status Code], [Part ID], [Pallet
ID], "Test1" AS TestCode, [Test1] As TestResult
FROM [yourspreadsheet]
WHERE [Test1] IS NOT NULL
UNION
SELECT [Date and Tim], [Fixture], [Status Code], [Part ID], [Pallet
ID], "Test2" AS TestCode, [Test2] As TestResult
FROM [yourspreadsheet]
WHERE [Test2] IS NOT NULL
UNION

<etc etc>

You probably will need two or more queries, 100 columns is a lot and
will be liable to the dreaded Query Too Complex error.

You can then base Append queries on these UNION queries to populate
your tall thin table.

John W. Vinson[MVP]
 
oh so John Vinson finally admits that MDB queries randomly CRAP OUT?

that is 40% of the reason that MDB sucks balls-- every once in a while
things crap out; not telling you any real debugging information-- just
giving up and throwing a tissy fit.

I find it humourous that you pansies still use MDB.
it is for pansies; wimps and retards that don't have the capacity to
learn SQL Server.

SQL Express / MSDE is JUST AS FREE as MDB.

MDB is stupid.
Spit on anyone that uses it for anything






John said:
Sorry, they were actually two completely separate questions. I do understand
that Im gonna start out with a single tall thin table. I just also need help
after that. Lets take things a step at a time then.

The data from the 4 testers is actually given to me in 4 separate
spreadsheets. Since tester 1 is the only one that does all those separate
tests, its ok to combine them into one table.

Here's an example with actual table field names and data. With the 1st part
it passed tester 1, but failed tester 2.

Date and Time - Fixture - Status Code - Part - Pallet ID - Test1 - Test2...etc
1/1/2006 8:00am - Fixture 1 - Pass - 1 - 1 - 10.5 -
0.05...etc
1/1/2006 8:05am - Fixture 2 - 1000 - 1 - 1 - -
1/1/2006 8:10am - Fixture 1 - Pass - 1 - 2 - 10.4 -
0.06...etc
Heres some field info relating to terms we were previously using, along with
some new ones:
There is actually a Date and Time field - I was using text to columns to
split it up, so we can just use that one.
Fixture = Tester
Status code = failure code
Part - We actually run about 3 different parts. They're all tested the same
though.
Pallet ID - The ID number of the pallet the part sits on as it goes down the
assembly line
Test1-~Test100 - They all have their own separate names, we can leave it at
test1, test2...etc for now.

So you want to migrate the data from these spreadsheets into a
tall-thin table (that's what I'd recommend anyway!)?

What's the structure of the tall-thin? What's its Primary Key?

Assuming that you have the normalized table set up as follows:

TestDateTime
Fixture
StatusCode
PartID
PalletID
TestCode
TestResult

then you could create a UNION query based on your spreadsheet:

SELECT [Date and Tim], [Fixture], [Status Code], [Part ID], [Pallet
ID], "Test1" AS TestCode, [Test1] As TestResult
FROM [yourspreadsheet]
WHERE [Test1] IS NOT NULL
UNION
SELECT [Date and Tim], [Fixture], [Status Code], [Part ID], [Pallet
ID], "Test2" AS TestCode, [Test2] As TestResult
FROM [yourspreadsheet]
WHERE [Test2] IS NOT NULL
UNION

<etc etc>

You probably will need two or more queries, 100 columns is a lot and
will be liable to the dreaded Query Too Complex error.

You can then base Append queries on these UNION queries to populate
your tall thin table.

John W. Vinson[MVP]
 
The more I learn about this the more I become concerned about different things.

Are you sure an append query is what I want to be using to combine the 3
separate queries I made (too avoid the too complex error)?

Will all of this update automatically seeing as I will be importing new data
on a daily basis?
 
The more I learn about this the more I become concerned about different things.

Are you sure an append query is what I want to be using to combine the 3
separate queries I made (too avoid the too complex error)?

I have no way of knowing WHAT YOU WANT. My telepathy is on the blink.

Three append queries will move the data from your spreadsheets into a
new table.
Will all of this update automatically seeing as I will be importing new data
on a daily basis?

No, of course it will NOT do so. An append query runs, copies its
data, and then it's done.

If you want to store your data in three spreadsheets, and have any
updates to those spreadsheets instantly reflected in your Access
application, then an append query is *not* the appropriate tool.

John W. Vinson[MVP]
 
I want to do what you reccomended for me to do.

Just to be sure we're on the same page, here's what I have so far.

1. I get 4 separate excel spreadsheets of data from the 4 testers.
2. I combine them into 1 excel spreadsheet. (Im now done with excel)
3. I import that excel spreadsheet into Access into a flat wide table with a
test per field.
4. I created 3 queries (to avoid the query too complex error) which
converted and separated the wide table into 3 tall thin queries putting all
the tests and test results into thier own single fields. (Instead of ~107
fields, Im down to 7)

Now how do I combine the 3 queries into a single table?

Also, please keep in mind I will be importing new data to this db daily. I
would also like everything to automatically go where it needs to after I
import the spreadsheet. When this is all set up I would like for it to be -
1. Import excel spreadsheet 2. I can now analyze the data in line/bar graphs
(eg. - test data vs some time period). Does this completely change the way I
have been setting this up?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top