1-in-N Selection in SQL

  • Thread starter Thread starter Neil Trotter
  • Start date Start date
N

Neil Trotter

Is there a way (in Access 2002) to select, say, 500 rows at random from
a dataset of several thousand using an SQL query?

I'm thinking along the lines of "order by random()" and "limit 500",
which might be one approach as expressed in MySQL for example. I'm not
sure that Access offers equivalent mechanisms however.

Ideas & suggestions welcome :-)
 
Is there a way (in Access 2002) to select, say, 500 rows at random from
a dataset of several thousand using an SQL query?

I'm thinking along the lines of "order by random()" and "limit 500",
which might be one approach as expressed in MySQL for example. I'm not
sure that Access offers equivalent mechanisms however.

Ideas & suggestions welcome :-)

The syntax is different but the logic is the same. Access has a Rnd()
function which returns a random number if it's passed a positive
argument - to use it in a query you MUST pass an argument, or Access
will "save time" by calling the function once and using the same value
in all rows. Assuming you have a numeric field, reliably positive,
named ID, try

SELECT TOP 500
field, field, field, field
FROM tablename
ORDER BY Rnd([ID]);

A little more work but maybe more general: copy this little routine
into a new VBA Module and use it:

Public Function RndNum(vIgnore As Variant) As Double
Static bRandomized As Boolean
If Not bRandomized Then ' initialize random function once only
Randomize
bRandomized = False
End If
RndNum = Rnd()
End Function

and pass the name of any field from the query as an argument (just to
make it call for each row):

SELECT TOP 500
field, field, field, field
FROM tablename
ORDER BY Rnd([AnyFieldName])

John W. Vinson[MVP]
John W. Vinson[MVP]
 
John Vinson said:
The syntax is different but the logic is the same. Access has a Rnd()
function which returns a random number if it's passed a positive
argument - to use it in a query you MUST pass an argument, or Access
will "save time" by calling the function once and using the same value
in all rows. Assuming you have a numeric field, reliably positive,
named ID, try

SELECT TOP 500
field, field, field, field
FROM tablename
ORDER BY Rnd([ID]);

A little more work but maybe more general: copy this little routine
into a new VBA Module and use it:

Public Function RndNum(vIgnore As Variant) As Double
Static bRandomized As Boolean
If Not bRandomized Then ' initialize random function once only
Randomize
bRandomized = False
End If
RndNum = Rnd()
End Function

and pass the name of any field from the query as an argument (just to
make it call for each row):

SELECT TOP 500
field, field, field, field
FROM tablename
ORDER BY Rnd([AnyFieldName])

Many thanks for this, John, that's exactly what I was looking for. The
2nd example is more useful in my case, as I am wanting to perform 1-in-N
selections of address files which (at this stage) have no primary key.

A couple of minor points which may be of use to others reading the
thread: The once-only randomization in the VBA function probably ought
to be setting bRandomised to True in order to Randomize once only; and
(trivially but vitally) the query should invoke the VBA function RndNum
in the ORDER BY clause.
 
Back
Top