Processing thousands of records

D

Dave

I have a query that someone had helped me with and it works great: However
when I run this against 91,000 records the query runs forever.

When running queries against thousands of records, is there a tricks to do
this effectively. Obviously the process and memory limits calculations, but
are there tricks to handling a lot of data?

Thanks

Dave
 
J

Jerry Whittle

How long is 'forever' in this case?

The first thing that affect performance is how well the database is
designed. Are the tables properly normalized?

Next could be the query itself. There are a few tricks of the trade like
trying an Exists clause instead of an In clause. The flip side is that
sometimes the In will work better than the Exists so you need to try it both
ways. Then there is differences between UNION and UNION ALL queries. The same
goes for DISTINCT versus GROUP BY all the fields.

Then there's indexing. Do you have indexes on the fields where you have
criteria?

Show us the SQL. Open the query in design view. Next go to View, SQL View
and copy and past it here. Information on primary keys and relationships
would be a nice touch too. Also any indexes.
 
A

Allen Browne

Dave said:
have a query that someone had helped me with and it works great:
However when I run this against 91,000 records the query runs forever.

When running queries against thousands of records, is there a tricks to
do this effectively. Obviously the process and memory limits
calculations,
but are there tricks to handling a lot of data?

That's a huge question. For an in-depth response, get "SQL Queries for Mere
Mortals" by John Viescas and Michael Hernandez:
http://safari.oreilly.com/9780321444431

For a basic answer:
a) Normalized tables.
b) Indexes
c) Craft your criteria to use the indexes
d) Use efficient joins
e) Avoid unnecessary function calls

For a bit more detail, see:
Common query hurdles
at:
http://allenbrowne.com/QueryPerfIssue.html
 
D

Dave

Thanks for the quick response.

Here is the query and I do not have the data indexed. I am not sure which
fields I need to index.

After 60 minutes I shut the app down. Processing was not completed. There
are 47,000 records.

SELECT T.[Item Number], T.[Serial Number], T.[Resource ID], T.[Task Number],
(SELECT Max([Labor End Date]) as PrevDate FROM 1stTimeFix
WHERE [Item Number] = T.[Item Number] AND [Serial Number] = T.[Serial Number]
AND [Task Number] < T.[Task Number]) AS PrevDate, T.[Labor Start
Date], T.[Labor End Date] INTO 1stVisit02
FROM 1stTimeFix AS T;
 
D

Dave

I am very weak with indexing. What does normalize tables mean?

Is there a way in incorporate an image in a post?

Dave
 
D

Dave

I took a shot at this and indexed Item Number, Serial Number and Task Number.

I got a return in 1 minute.

Soooo.... where do I read about fundamental indexing and normalization?

Thanks

Dave
 
M

Michel Walsh

Depends on the query. Sometimes, you may have to rewrite the query to use
something that has a higher starting time, but which does not degrade as
fast. As example, ranking can be done in many different ways, but for a very
large number of records, with Jet, it may be preferable to APPEND the
ORDERed record into a temporary table set with an autonumber field: it is
longer, for you, to write the required code (check if the temp table exist
and delete it if so, then create the temp table, then run the append query,
then look at the result into the temp table, checking the autonumber field
supplying the ranks), but the overall execution time can become a fraction
of the time you would get with a sub-query, or with a non-equi join, for a
very large number of records that is...


Vanderghast, Access MVP
 
B

Bob Barrows [MVP]

Dave said:
I have a query that someone had helped me with and it works great:
However when I run this against 91,000 records the query runs forever.

When running queries against thousands of records, is there a tricks
to do this effectively. Obviously the process and memory limits
calculations, but are there tricks to handling a lot of data?

Thanks
As a short cut, you can run your query through the Access Optimizer and
see what it suggests.

Of course, automated tools like that are only a starting point that
cannot substitute for doing the research about the concepts mentioned by
John and Allen. Sometimes the answer provided by the optimizer is not
appropriate, and only knowledge can help you distinguish the good
suggestions from the bad.
 
J

Jerry Whittle

Great!

If you lived near St. Louis, you could take the college course that I teach.
;-)

Allen mentioned one book by Hernandez. To that I'll add "Database Design for
Mere Mortals" also by Michael Hernandez for basic database design and
normalization.

http://safari.oreilly.com/0201752840

As for indexes, every table should have a primary key. Hopefully it is only
one field. Access automatically creates an index for primary key fields.

If your queries join two or more tables, go to the Relationships window and
see if a relationship is defined between the tables involved. Hopefully they
are and Referential Integrity is enables as this will also create an index
with the Foreign Key in the other table. If not, see if you can link the
tables and turn on RI. There's a very good chance that it can not. In that
case mark sure that the linked fields in both tables have indexes.

Also index any field that is part of the Where Clause or Order By.

Warning: Indexes aren't free. They also take up space and can slow down
inserts and updates. Don't go overboard on the indexing. As mentioned before
Access will automatically create an index for Primary Key fields. It will
also create indexes for field with any of the following in any part of the
field names: ID; key; code; num. In fact Access has no problem creating two
or more indexes on the same field. For example if your Primary Key is named
PK_ID, you will have two indexes for the same field which is wasteful.

Now for your query: INTO 1stVisit02

Often inserting records into another table is a sign of problems. Normally
you just want to grab the data from the primary tables as needed. Maybe now
that the query is working faster, you don't need the 1stVisit02 query. Moving
records from one table to another is often a mistake and a sign that the data
isn't normalized properly. An exception is when importing data from other
files and programs.
 
D

Dave

I notice in other Forums there are FAQs, and useful threads bookmarked. I
know MS had KBs but I find them difficult to follow without a lot of examples.

Too bad there are not bookmarked FAQs here.

=======================================

I have received a great deal of reference material and help in this one post.

I hope others that struggle with similar issues can find this thread as
useful as I do.

Bob I had not even thought of the Optimizer.

Thanks Dave
 
J

John Spencer

In addition to all the suggestions you got here. I posted an alternate query
that may (or may not) be faster in your previous thread.

You might take a look at it and see if you can adapt it.

You definitely need indexes on Item Number, serial Number and task number. If
you try to use my posted query, you should also index Labor End Date.


John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
Thanks for the quick response.

Here is the query and I do not have the data indexed. I am not sure which
fields I need to index.

After 60 minutes I shut the app down. Processing was not completed. There
are 47,000 records.

SELECT T.[Item Number], T.[Serial Number], T.[Resource ID], T.[Task Number],
(SELECT Max([Labor End Date]) as PrevDate FROM 1stTimeFix
WHERE [Item Number] = T.[Item Number] AND [Serial Number] = T.[Serial Number]
AND [Task Number] < T.[Task Number]) AS PrevDate, T.[Labor Start
Date], T.[Labor End Date] INTO 1stVisit02
FROM 1stTimeFix AS T;




Jerry Whittle said:
How long is 'forever' in this case?

The first thing that affect performance is how well the database is
designed. Are the tables properly normalized?

Next could be the query itself. There are a few tricks of the trade like
trying an Exists clause instead of an In clause. The flip side is that
sometimes the In will work better than the Exists so you need to try it both
ways. Then there is differences between UNION and UNION ALL queries. The same
goes for DISTINCT versus GROUP BY all the fields.

Then there's indexing. Do you have indexes on the fields where you have
criteria?

Show us the SQL. Open the query in design view. Next go to View, SQL View
and copy and past it here. Information on primary keys and relationships
would be a nice touch too. Also any indexes.
 
J

Jerry Whittle

I teach at SWIC over in Belleville. It's at the College Metrolink station.
Class starts on Monday, August 25th evening. A few seats still open last I
checked.
 
D

Dave

John did you post the query?

Dave

John Spencer said:
In addition to all the suggestions you got here. I posted an alternate query
that may (or may not) be faster in your previous thread.

You might take a look at it and see if you can adapt it.

You definitely need indexes on Item Number, serial Number and task number. If
you try to use my posted query, you should also index Labor End Date.


John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
Thanks for the quick response.

Here is the query and I do not have the data indexed. I am not sure which
fields I need to index.

After 60 minutes I shut the app down. Processing was not completed. There
are 47,000 records.

SELECT T.[Item Number], T.[Serial Number], T.[Resource ID], T.[Task Number],
(SELECT Max([Labor End Date]) as PrevDate FROM 1stTimeFix
WHERE [Item Number] = T.[Item Number] AND [Serial Number] = T.[Serial Number]
AND [Task Number] < T.[Task Number]) AS PrevDate, T.[Labor Start
Date], T.[Labor End Date] INTO 1stVisit02
FROM 1stTimeFix AS T;




Jerry Whittle said:
How long is 'forever' in this case?

The first thing that affect performance is how well the database is
designed. Are the tables properly normalized?

Next could be the query itself. There are a few tricks of the trade like
trying an Exists clause instead of an In clause. The flip side is that
sometimes the In will work better than the Exists so you need to try it both
ways. Then there is differences between UNION and UNION ALL queries. The same
goes for DISTINCT versus GROUP BY all the fields.

Then there's indexing. Do you have indexes on the fields where you have
criteria?

Show us the SQL. Open the query in design view. Next go to View, SQL View
and copy and past it here. Information on primary keys and relationships
would be a nice touch too. Also any indexes.
--
Jerry Whittle, Microsoft Access MVP
Light. Strong. Cheap. Pick two. Keith Bontrager - Bicycle Builder.


:

I have a query that someone had helped me with and it works great: However
when I run this against 91,000 records the query runs forever.

When running queries against thousands of records, is there a tricks to do
this effectively. Obviously the process and memory limits calculations, but
are there tricks to handling a lot of data?

Thanks

Dave
 
J

John Spencer

It should be in response to your earlier message.

What I posted there was

SELECT T.Item, T.TransactionID
, T.TransDate
, Max(T2.TransDate) as PriorDate
, DateDiff("D",Max(T2.TransDate),T.TransDate) as Elapsed
FROM YourTable as T LEFT JOIN YourTable as T2
ON T.Item = T2.Item
AND T.TransDate > T2.TransDate
GROUP BY T.Item, T.TransactionID, T.TransDate


John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
John did you post the query?

Dave

John Spencer said:
In addition to all the suggestions you got here. I posted an alternate query
that may (or may not) be faster in your previous thread.

You might take a look at it and see if you can adapt it.

You definitely need indexes on Item Number, serial Number and task number. If
you try to use my posted query, you should also index Labor End Date.


John Spencer
Access MVP 2002-2005, 2007-2008
The Hilltop Institute
University of Maryland Baltimore County
Thanks for the quick response.

Here is the query and I do not have the data indexed. I am not sure which
fields I need to index.

After 60 minutes I shut the app down. Processing was not completed. There
are 47,000 records.

SELECT T.[Item Number], T.[Serial Number], T.[Resource ID], T.[Task Number],
(SELECT Max([Labor End Date]) as PrevDate FROM 1stTimeFix
WHERE [Item Number] = T.[Item Number] AND [Serial Number] = T.[Serial Number]
AND [Task Number] < T.[Task Number]) AS PrevDate, T.[Labor Start
Date], T.[Labor End Date] INTO 1stVisit02
FROM 1stTimeFix AS T;




:

How long is 'forever' in this case?

The first thing that affect performance is how well the database is
designed. Are the tables properly normalized?

Next could be the query itself. There are a few tricks of the trade like
trying an Exists clause instead of an In clause. The flip side is that
sometimes the In will work better than the Exists so you need to try it both
ways. Then there is differences between UNION and UNION ALL queries. The same
goes for DISTINCT versus GROUP BY all the fields.

Then there's indexing. Do you have indexes on the fields where you have
criteria?

Show us the SQL. Open the query in design view. Next go to View, SQL View
and copy and past it here. Information on primary keys and relationships
would be a nice touch too. Also any indexes.
--
Jerry Whittle, Microsoft Access MVP
Light. Strong. Cheap. Pick two. Keith Bontrager - Bicycle Builder.


:

I have a query that someone had helped me with and it works great: However
when I run this against 91,000 records the query runs forever.

When running queries against thousands of records, is there a tricks to do
this effectively. Obviously the process and memory limits calculations, but
are there tricks to handling a lot of data?

Thanks

Dave
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top