Record source at runtime

B

BruceM

I have read that one of the ways to improve performance is to set a form's
record source in the Load event rather than by setting the Record Source
property. Do I understand correctly, and are there caveats to this
approach?
If this approach makes good sense, I am wondering about which approach to
take to open a form (I will call it frmTwo) to a particular record that
corresponds to the current record on the main form. Should the WHERE
condition be part of the recordsource SQL in the Load event for frmTwo?

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo " & _
"WHERE MainID = " & Forms!frmMain!MainID
Me.RecordSource = strSQL

Or should the recordsource SQL be thus:

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo "
Me.RecordSource = strSQL

with the specific record code in the OpenForm action?
DoCmd.OpenForm "frmTwo", , , "[MainID] = " & MainID

In either case the frmTwo is being opened from a command button on frmMain.

If setting the record source at runtime is a worthwhile approach, is the
Load event the place to do that? If there are related comments or
observations I did not raise with my specific questions I would appreciate
hearing about them.
 
A

Arvin Meyer [MVP]

The only time I think it may be advantageous to set a recordsource at
runtime is if there are several subforms which are not displayed. In that
case, I set the recordsource on-the-fly as each subform becomes visible (tab
control. or multipage form).
 
M

Marshall Barton

BruceM said:
I have read that one of the ways to improve performance is to set a form's
record source in the Load event rather than by setting the Record Source
property. Do I understand correctly, and are there caveats to this
approach?
If this approach makes good sense, I am wondering about which approach to
take to open a form (I will call it frmTwo) to a particular record that
corresponds to the current record on the main form. Should the WHERE
condition be part of the recordsource SQL in the Load event for frmTwo?

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo " & _
"WHERE MainID = " & Forms!frmMain!MainID
Me.RecordSource = strSQL

Or should the recordsource SQL be thus:

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo "
Me.RecordSource = strSQL

with the specific record code in the OpenForm action?
DoCmd.OpenForm "frmTwo", , , "[MainID] = " & MainID

In either case the frmTwo is being opened from a command button on frmMain.

If setting the record source at runtime is a worthwhile approach, is the
Load event the place to do that? If there are related comments or
observations I did not raise with my specific questions I would appreciate
hearing about them.

The most important aspect to form/network performance is to
filter form/list/combo datasets to as few records as
required to do the job (often zero or one trcord). Setting
the record source is one way of doing that, but often there
are other, easier ways. Using the OpenForm/Report method's
WhereCondition argument is a common and effective approach.
In an unusual case where playing around with the
RecordSource property is needed, you would normally use the
Open event. However there are other, even more uncommon,
situations where changing the record source on the fly in
any appropriate event is needed.

I think there are more cases where setting a list combo
box's RowSource is appropriate, but certainly not a general
rule.

My bottom line is that you should not worry about it until
you run into a situation that requires a closer look. Even
then the first thing to look at is finding a way to reduce
the size of the dataset.
 
B

BruceM

Thanks to both for your replies. I think I read something with a fairly
specific application and tried to turn it into a general rule. It seems I
have been overthinking this, so I will keep it as it was (using the Where
part of the OpenForm method). It wasn't giving me any trouble, and I was
indeed limiting the recordset to a single record, but I wondered if there
was a better way.
Arvin, I see what you mean about something such as a tab control that
contains several subforms. That makes sense to me.
In keeping with the general practice of limiting the size of the recordset,
I suppose it would be best to limit the fields to just those I need rather
than using the whole table as the Record Source. Probably not a big deal in
this case, but if it is a best practice I may as well do it that way.
Thanks again.


Marshall Barton said:
BruceM said:
I have read that one of the ways to improve performance is to set a form's
record source in the Load event rather than by setting the Record Source
property. Do I understand correctly, and are there caveats to this
approach?
If this approach makes good sense, I am wondering about which approach to
take to open a form (I will call it frmTwo) to a particular record that
corresponds to the current record on the main form. Should the WHERE
condition be part of the recordsource SQL in the Load event for frmTwo?

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo " & _
"WHERE MainID = " & Forms!frmMain!MainID
Me.RecordSource = strSQL

Or should the recordsource SQL be thus:

dim strSQL as String
strSQL = "SELECT MainID, TwoText FROM tblTwo "
Me.RecordSource = strSQL

with the specific record code in the OpenForm action?
DoCmd.OpenForm "frmTwo", , , "[MainID] = " & MainID

In either case the frmTwo is being opened from a command button on
frmMain.

If setting the record source at runtime is a worthwhile approach, is the
Load event the place to do that? If there are related comments or
observations I did not raise with my specific questions I would appreciate
hearing about them.

The most important aspect to form/network performance is to
filter form/list/combo datasets to as few records as
required to do the job (often zero or one trcord). Setting
the record source is one way of doing that, but often there
are other, easier ways. Using the OpenForm/Report method's
WhereCondition argument is a common and effective approach.
In an unusual case where playing around with the
RecordSource property is needed, you would normally use the
Open event. However there are other, even more uncommon,
situations where changing the record source on the fly in
any appropriate event is needed.

I think there are more cases where setting a list combo
box's RowSource is appropriate, but certainly not a general
rule.

My bottom line is that you should not worry about it until
you run into a situation that requires a closer look. Even
then the first thing to look at is finding a way to reduce
the size of the dataset.
 
A

Arvin Meyer [MVP]

Arvin, I see what you mean about something such as a tab control that
contains several subforms. That makes sense to me.
In keeping with the general practice of limiting the size of the
recordset, I suppose it would be best to limit the fields to just those I
need rather than using the whole table as the Record Source. Probably not
a big deal in this case, but if it is a best practice I may as well do it
that way.
Thanks again.

The way Jet works is to go to the server, find the table(s) and pull the
index. If no index is requested, it will pull the entire table of data.
Assuming good design, and therefore correct indexing, once the index is
satisfied, it returns to the server and gets just the row(s) of data
requested. Limiting the fields doesn't matter much unless they are OLE
fields (images, hyperlinks, memos) Those are stored on a different place in
the database and aren't called unless requested.
 
B

BruceM

Arvin Meyer said:
The way Jet works is to go to the server, find the table(s) and pull the
index. If no index is requested, it will pull the entire table of data.
Assuming good design, and therefore correct indexing, once the index is
satisfied, it returns to the server and gets just the row(s) of data
requested. Limiting the fields doesn't matter much unless they are OLE
fields (images, hyperlinks, memos) Those are stored on a different place
in the database and aren't called unless requested.
--
Arvin Meyer, MCP, MVP
http://www.datastrat.com
http://www.mvps.org/access
http://www.accessmvp.com
Thanks for the follow-up, and for clarifying that limiting the fields makes
little difference in most cases, but I have to say I'm not sure just what
you mean by "pull the index" and "if no index is requested".
 
A

Arvin Meyer [MVP]

If you have defined a primary key, always include that in your query. to
speed things up other indices are often added like last name, first name,
etc. When using those indexes, always use them in the order they are created
(check the table index list) Like:

Select PersonID, LastName, FirstName From tblPeople Where LastName = "Meyer"
And FirstName = "Arvin";

Notice, I've used the PK and the index in the order I created them. That's
the way most (and for sure both Access and SQL-Server) use an index to look
for data.

In the case of Access, It see's that the PK isn't in the Where Clause, but
is in the data requested, so it will return the PKs from those rows that use
"Meyer", then it will see that it needs to go back and return the data
requested in the 3 fields from those rows.

SQL-Server does something similar, except since it's running on the engine,
it doesn't need to return the index first, it processes that on the server
and returns the data.

In both cases, had I written the query:

Select PersonID, FirstName, LastName From tblPeople Where FirstName =
"Arvin" And LastName = "Meyer";

It would still look for the index on LastName before it did on FirstName,
taken several milliseconds before it moved to the next index. In the above
case, it wouldn't be noticeable, but it sure would if you were looking for
John Smith in a couple of hundred thousand records.
--
Arvin Meyer, MCP, MVP
http://www.datastrat.com
http://www.mvps.org/access
http://www.accessmvp.com
 
M

Marshall Barton

Arvin said:
The way Jet works is to go to the server, find the table(s) and pull the
index. If no index is requested, it will pull the entire table of data.
Assuming good design, and therefore correct indexing, once the index is
satisfied, it returns to the server and gets just the row(s) of data
requested. Limiting the fields doesn't matter much unless they are OLE
fields (images, hyperlinks, memos) Those are stored on a different place in
the database and aren't called unless requested.

True, but I think it is still a good idea to avoid
retrieving fields that are not used.
 
B

BruceM

Thanks again. I didn't realize some of what you said about indexes. I have
more than one query that is lacking the PK field. It probably matters
little with a small number of records, but best practice is best practice.
Also, I didn't know that about the index order. I had known that indexing
is important, but that is the best concise explanation I have heard about
why it is important. Thanks for taking the time to explain. Now there's
one more thing to think about. Funny about how that list keeps growing.
 
B

BruceM

Marshall Barton said:
True, but I think it is still a good idea to avoid
retrieving fields that are not used.

Thanks for the comments. I think what it probably comes down to is if I
have a named query for a particular record source or something, I can go
ahead and use it in a second situation even if it has a few more fields than
I need.
 
M

Marshall Barton

BruceM said:
Thanks for the comments. I think what it probably comes down to is if I
have a named query for a particular record source or something, I can go
ahead and use it in a second situation even if it has a few more fields than
I need.


Of course. Just don't get in the habit of using
SELECT tbl.*
without having a reason.
 
B

BruceM

Marshall Barton said:
Of course. Just don't get in the habit of using
SELECT tbl.*
without having a reason.

My reason for doing that is almost always testing in the development phase
of a project (especially with SQL used in VBA for a RecordSource or some
such, where I'm not energetic enough to type the whole thing until I'm sure
I will use it).
 
D

David W. Fenton

If you have defined a primary key, always include that in your
query. to speed things up other indices are often added like last
name, first name, etc. When using those indexes, always use them
in the order they are created (check the table index list) Like:

Select PersonID, LastName, FirstName From tblPeople Where LastName
= "Meyer" And FirstName = "Arvin";

Notice, I've used the PK and the index in the order I created
them. That's the way most (and for sure both Access and
SQL-Server) use an index to look for data.

In the case of Access, It see's that the PK isn't in the Where
Clause, but is in the data requested, so it will return the PKs
from those rows that use "Meyer", then it will see that it needs
to go back and return the data requested in the 3 fields from
those rows.

The wording of your two answers puzzles me. You seem to me to be
saying that including the PK in the SELECT clause improves
performance somehow. I can't see how.

Jet uses clustered indexes for the primary key, and that means that
the table's data is written in PK order after a compact. That means
that the data pages have to retrieved via the PK no matter what you
select on. That is, you have to go through the PK no matter *what*
criteria you use, since the data pages are stored according to the
PK.

So, what would happen is this (oversimplied for clarity):

1. retrieve indexes for Lastname and Firstname.

2. those indexes will point to the PKs of the records that match.

3. request the data pages for those PKs.

It may be that the PK index has to be retrieved, but I doubt it. I'd
think it would be retrieved only when there are criteria on the PK.
It depends on whether the non-primary indexes store pointers to the
PK values or pointers to the data pages. I'm not sure what actually
happens. If it's the former, then the PK index would be required to
find the actual data pages.

On the other hand, since the PK index is clustered, it may be that
the pages function as the index (using an offset to find the PK
values). I'm not sure about this.

But I'm pretty certain that the SELECT clause comes into effect only
after the minimal number of indexes have been used to decide which
data pages need to be retrieved (or, which PK values will be used to
decide which data pages to retrieve).
SQL-Server does something similar, except since it's running on
the engine, it doesn't need to return the index first, it
processes that on the server and returns the data.

In both cases, had I written the query:

Select PersonID, FirstName, LastName From tblPeople Where
FirstName = "Arvin" And LastName = "Meyer";

It would still look for the index on LastName before it did on
FirstName, taken several milliseconds before it moved to the next
index. In the above case, it wouldn't be noticeable, but it sure
would if you were looking for John Smith in a couple of hundred
thousand records.

Are you sure about that? I'd think that the query optimizer would
fix that for you. That is, it's likely to use the least sparse index
first.

Have you checked whether SHOWPLAN gives you any indication on
exactly what happens in regard to the order of filtering by indexes
in the same table?
 
D

David W. Fenton

I didn't realize some of what you said about indexes. I have
more than one query that is lacking the PK field. It probably
matters little with a small number of records, but best practice
is best practice. Also, I didn't know that about the index order.
I had known that indexing is important, but that is the best
concise explanation I have heard about why it is important.

I'm not certain that either of Arvin's points that you allude to
above is correct. His assertions are entirely news to me.
 
A

Arvin Meyer [MVP]

David W. Fenton said:
I'm not certain that either of Arvin's points that you allude to
above is correct. His assertions are entirely news to me.

Years ago in a SQL-Server design course the indexing plan was explained for
SQL-Server. The teacher had been worked with Fox and described Rushmore
Technology has having a similar query plan implementation. You can probably
test it yourself using ShowPlan.
 
A

Arvin Meyer [MVP]

The wording of your two answers puzzles me. You seem to me to be
saying that including the PK in the SELECT clause improves
performance somehow. I can't see how.

IIRC, here's the reasoning supplied to me. The primary index is the PK, if
you have the PK in the initial return of indexes, when it fetches the data,
it will use the PK to bring those rows back to the workstation. If you don't
have the PK, it will still find the data using the supplied index, but must
do a second scan, instead of going right to the row.
Jet uses clustered indexes for the primary key, and that means that
the table's data is written in PK order after a compact. That means
that the data pages have to retrieved via the PK no matter what you
select on. That is, you have to go through the PK no matter *what*
criteria you use, since the data pages are stored according to the
PK.

So, what would happen is this (oversimplied for clarity):

1. retrieve indexes for Lastname and Firstname.

2. those indexes will point to the PKs of the records that match.

3. request the data pages for those PKs.

That's how I understand it too.
It may be that the PK index has to be retrieved, but I doubt it. I'd
think it would be retrieved only when there are criteria on the PK.
It depends on whether the non-primary indexes store pointers to the
PK values or pointers to the data pages. I'm not sure what actually
happens. If it's the former, then the PK index would be required to
find the actual data pages.

I am pretty sure that the PK does need to be retrieved if it isn't in the
query.
On the other hand, since the PK index is clustered, it may be that
the pages function as the index (using an offset to find the PK
values). I'm not sure about this.

I'm not sure that the PK is clustered. Where did you read that?
Are you sure about that? I'd think that the query optimizer would
fix that for you. That is, it's likely to use the least sparse index
first.

As I remember, the query optimizers uses the index index (ie: index(0),
index(1), etc.)
Have you checked whether SHOWPLAN gives you any indication on
exactly what happens in regard to the order of filtering by indexes
in the same table?

No I haven't but if you have a large set of data, perhaps 100K rows, you
might see the way that ShowPlan looks at it.
 
A

Arvin Meyer [MVP]

Of course. Just don't get in the habit of using
SELECT tbl.*
without having a reason.

Especially if there are OLE field in the table. The only time I ever do
Select * is when I'm using the original data entry form.
 
D

David W. Fenton

IIRC, here's the reasoning supplied to me. The primary index is
the PK, if you have the PK in the initial return of indexes, when
it fetches the data, it will use the PK to bring those rows back
to the workstation. If you don't have the PK, it will still find
the data using the supplied index, but must do a second scan,
instead of going right to the row.

Seems to me it must *always* use the PK, no matter what (except for
a table scan, I guess), because the PK is the lookup for the data
pages (since the tables are stored in PK order). This has always
been my assumption, that, no matter what indexes are used for the
WHERE clause, those indexes are keyed to the PK value, which means
the PK index has to be used to find the data pages.

However, now that I think about it, I'm not entirely certain where
this idea came from.
That's how I understand it too.


I am pretty sure that the PK does need to be retrieved if it isn't
in the query.

Sure, but having it in the SELECT statement shouldn't have any
effect on how the query plan is processed. Really, the SELECT
statement is not operational until the data pages have already been
retrieved, and all it does is determine which fields are displayed.
I'm not sure that the PK is clustered. Where did you read that?

Jet Database Engine Programmer's Handbook, as well as many postings
in these newsgroups. I can't find it in the index of the book after
a quick scan, though.

Here's some documentation:

http://support.microsoft.com/default.aspx?id=137039

AND

http://support.microsoft.com/default.aspx?scid=kb;en-us;209769

Of course, the PK clustered index is not as full-featured a
clustered index as one finds in SQL Server, for example, and it
applies only to the PK, and it's not maintained except during a
compact, and it is perhaps not the kind where the lowest level of
the index contains the data pages.
As I remember, the query optimizers uses the index index (ie:
index(0), index(1), etc.)

But I just don't see a reason why having the PK in the SELECT makes
a difference at all. The Jet Programmer's Guide says nothing about
this at all, and I've never used it. When I tested SHOWPLAN below,
there was absolutely no difference in performance between the two on
a pretty large table (though it was running locally, not across a
network).
No I haven't but if you have a large set of data, perhaps 100K
rows, you might see the way that ShowPlan looks at it.

Hmm. I just checked on a table with 300-400K records, and SHOWPLAN
shows no difference for including the PK or not (which means
nothing). Showplan only documents index usage in joins, I guess.
 
D

David W. Fenton

Years ago in a SQL-Server design course the indexing plan was
explained for SQL-Server. The teacher had been worked with Fox and
described Rushmore Technology has having a similar query plan
implementation. You can probably test it yourself using ShowPlan.

I don't think Jet works the way you say. The documentation on
Rushmore in the Jet Database Engine Programmer's Guide says nothing
about the SELECT statement interacting with the query plan.

SHOWPLAN doesn't say what indexes it uses -- the plan is identical
regardless of the SELECT statement (which would lead me to believe
that the SELECT statement has no meaningful impact on query
optimization -- if it did, that would have to be included, no?).
 
B

BruceM

I have been following maybe 10% of this, but I'm starting to think I should
include the PK in the query when I specifically need that field. What I am
getting from the discussion (and other things I have read) is that it
probably does not make much difference on recordsets of the sizes that are
typical on my projects so far, and that proper indexing of other fields that
are actually used as output (e.g. LastName, FirstName) has more to do with
performance in any case.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top