Help on SQL:Selecting most frequent records

M

mota

Hello;
In a medical system,we have a table carrying all medicines of all
prescriptions of the current month.All data are saved as numeric,so we have
some look up tables for physicians name and ID,Drugs name and ID,and this
IDs are saved in our tables,rather than names.Fields of this table are as
follow:
RecID: an autonumber field just for having a primary key in table
PID: or physicianID that shows ID of the prescriber of each drug
DID: or DrugID that represents the ID of medicine that is prescribed by this
physician
Dnu: that is the quantity of this drug in its relative prescription (has
less importance in my query)
some other fields are TotalPrice,Date,PrescriptionID,InsuranceType and so
on.
As an example,we may have more than 1000 records relative to a specific
physician having PID=26,and 100 of this records can be for the drug Advil
(with DID=5),and near 80 records for Tylenol (DID=260),and so on.That means
Advil is 100 times prescribed by this physician in this month,and Tylenol
about 60 times and in 60 prescription.
Now,at the end of each month we want to know which medicines are more
interested by a specific physician.In other word,we want to select at least
top 20 medicines that are more frequent by the given physician (PID can be
passed to a function or query as its parameter).
The resulting query must have at least 1 field: DID (ID for more frequent
drug) and sorted Descending

So,in the example above,Advil must be the first selected record and Tylenol
the 2nd one.
Must consider that selecting this 20 records is kinda DistinctRow,otherwise
all 20 resulting records will be Advil !
I believe that having a new calculated field in resulting records,[Time of
Repeating],would complicate my SQL.But i will be glad if it is possible to
having that.
Can anyone please help me?I will so much grateful to you and thank you in
advance.
 
M

Marshall Barton

mota said:
In a medical system,we have a table carrying all medicines of all
prescriptions of the current month.All data are saved as numeric,so we have
some look up tables for physicians name and ID,Drugs name and ID,and this
IDs are saved in our tables,rather than names.Fields of this table are as
follow:
RecID: an autonumber field just for having a primary key in table
PID: or physicianID that shows ID of the prescriber of each drug
DID: or DrugID that represents the ID of medicine that is prescribed by this
physician
Dnu: that is the quantity of this drug in its relative prescription (has
less importance in my query)
some other fields are TotalPrice,Date,PrescriptionID,InsuranceType and so
on.
As an example,we may have more than 1000 records relative to a specific
physician having PID=26,and 100 of this records can be for the drug Advil
(with DID=5),and near 80 records for Tylenol (DID=260),and so on.That means
Advil is 100 times prescribed by this physician in this month,and Tylenol
about 60 times and in 60 prescription.
Now,at the end of each month we want to know which medicines are more
interested by a specific physician.In other word,we want to select at least
top 20 medicines that are more frequent by the given physician (PID can be
passed to a function or query as its parameter).
The resulting query must have at least 1 field: DID (ID for more frequent
drug) and sorted Descending

So,in the example above,Advil must be the first selected record and Tylenol
the 2nd one.
Must consider that selecting this 20 records is kinda DistinctRow,otherwise
all 20 resulting records will be Advil !
I believe that having a new calculated field in resulting records,[Time of
Repeating],would complicate my SQL.But i will be glad if it is possible to
having that.


I doubt that I followed all that, especially when you say
you don't want a calculated field.

The way I read it, I think you need to use a series of
queries:

qryDIDcounts:
SELECT PID, DID, Count(*) As DIDcount
FROM maintable
GROUP BY PID, DID

qryDIDranking:
SELECT PID, DID, DIDcount,
(SELECT Count(*)
FROM qryDIDcounts AS T
WHERE T.DIDcount <= A.DIDcount
) AS DIDrank
FROM qryDIDcounts AS A

qryTop20:
SELECT PID, DID, DIDCOUNT, DIDrank
FROM qryDIDranking
WHERE DIDrank <= 20

Note that the above is just untested air code intended to
provide an outline of an approach. It is not a copy/paste
solution.
 
M

mota

Dear marshal;
Its a good solution,but since we want to work on prescribers one by one,i
have to assign a SQL statement to a query (as its SQL property) in my
code.So its better to have one SQL statement.Otherwise,i must filter 3 SQL
sentence for one physician,before executing the last one.For this reason,i
worked on the first query of your reply,tried to sort and filter it as i
need,and kinda closed to my answer.Now only i have to limit that query to 20
records.
What i misunderstood and may help me find a better solution,is DIDranking in
the second query.I think interpreting it may lead me to a better
solution.What does this field do for us?
Thank you so much for your time and help.


Marshall Barton said:
mota said:
In a medical system,we have a table carrying all medicines of all
prescriptions of the current month.All data are saved as numeric,so we have
some look up tables for physicians name and ID,Drugs name and ID,and this
IDs are saved in our tables,rather than names.Fields of this table are as
follow:
RecID: an autonumber field just for having a primary key in table
PID: or physicianID that shows ID of the prescriber of each drug
DID: or DrugID that represents the ID of medicine that is prescribed by this
physician
Dnu: that is the quantity of this drug in its relative prescription (has
less importance in my query)
some other fields are TotalPrice,Date,PrescriptionID,InsuranceType and so
on.
As an example,we may have more than 1000 records relative to a specific
physician having PID=26,and 100 of this records can be for the drug Advil
(with DID=5),and near 80 records for Tylenol (DID=260),and so on.That means
Advil is 100 times prescribed by this physician in this month,and Tylenol
about 60 times and in 60 prescription.
Now,at the end of each month we want to know which medicines are more
interested by a specific physician.In other word,we want to select at least
top 20 medicines that are more frequent by the given physician (PID can be
passed to a function or query as its parameter).
The resulting query must have at least 1 field: DID (ID for more frequent
drug) and sorted Descending

So,in the example above,Advil must be the first selected record and Tylenol
the 2nd one.
Must consider that selecting this 20 records is kinda DistinctRow,otherwise
all 20 resulting records will be Advil !
I believe that having a new calculated field in resulting records,[Time of
Repeating],would complicate my SQL.But i will be glad if it is possible to
having that.


I doubt that I followed all that, especially when you say
you don't want a calculated field.

The way I read it, I think you need to use a series of
queries:

qryDIDcounts:
SELECT PID, DID, Count(*) As DIDcount
FROM maintable
GROUP BY PID, DID

qryDIDranking:
SELECT PID, DID, DIDcount,
(SELECT Count(*)
FROM qryDIDcounts AS T
WHERE T.DIDcount <= A.DIDcount
) AS DIDrank
FROM qryDIDcounts AS A

qryTop20:
SELECT PID, DID, DIDCOUNT, DIDrank
FROM qryDIDranking
WHERE DIDrank <= 20

Note that the above is just untested air code intended to
provide an outline of an approach. It is not a copy/paste
solution.
 
M

Marshall Barton

The ranking query only assigns a rank number based on the
wuantity of each DID so that the next query can limit the
results to the top 20. It's not 100% reliable in the sense
that it can not separate ties if there are more than on DID
in the 20th position (nor can any other approach for that
matter).

But none of that may matter if you on want the result for a
single PID. Dealing with multiple PIDs was the cause of the
complexity in my previous reply. How about giving this kind
of query a try:

SELECT TOP 20 DID, Count(DID) AS DIDcount
FROM maintable
WHERE PID = Forms!someform.cboPhysicians
GROUP BY DID
ORDER BY Count(DID) DESC;

Note that I used a reference to a control on a form to
select the PID so you would not have to modify a the query's
SQL property.
--
Marsh
MVP [MS Access]


Its a good solution,but since we want to work on prescribers one by one,i
have to assign a SQL statement to a query (as its SQL property) in my
code.So its better to have one SQL statement.Otherwise,i must filter 3 SQL
sentence for one physician,before executing the last one.For this reason,i
worked on the first query of your reply,tried to sort and filter it as i
need,and kinda closed to my answer.Now only i have to limit that query to 20
records.
What i misunderstood and may help me find a better solution,is DIDranking in
the second query.I think interpreting it may lead me to a better
solution.What does this field do for us?

mota said:
In a medical system,we have a table carrying all medicines of all
prescriptions of the current month.All data are saved as numeric,so we have
some look up tables for physicians name and ID,Drugs name and ID,and this
IDs are saved in our tables,rather than names.Fields of this table are as
follow:
RecID: an autonumber field just for having a primary key in table
PID: or physicianID that shows ID of the prescriber of each drug
DID: or DrugID that represents the ID of medicine that is prescribed by this
physician
Dnu: that is the quantity of this drug in its relative prescription (has
less importance in my query)
some other fields are TotalPrice,Date,PrescriptionID,InsuranceType and so
on.
As an example,we may have more than 1000 records relative to a specific
physician having PID=26,and 100 of this records can be for the drug Advil
(with DID=5),and near 80 records for Tylenol (DID=260),and so on.That means
Advil is 100 times prescribed by this physician in this month,and Tylenol
about 60 times and in 60 prescription.
Now,at the end of each month we want to know which medicines are more
interested by a specific physician.In other word,we want to select at least
top 20 medicines that are more frequent by the given physician (PID can be
passed to a function or query as its parameter).
The resulting query must have at least 1 field: DID (ID for more frequent
drug) and sorted Descending

So,in the example above,Advil must be the first selected record and Tylenol
the 2nd one.
Must consider that selecting this 20 records is kinda DistinctRow,otherwise
all 20 resulting records will be Advil !
I believe that having a new calculated field in resulting records,[Time of
Repeating],would complicate my SQL.But i will be glad if it is possible to
having that.

"Marshall Barton" wrote
I doubt that I followed all that, especially when you say
you don't want a calculated field.

The way I read it, I think you need to use a series of
queries:

qryDIDcounts:
SELECT PID, DID, Count(*) As DIDcount
FROM maintable
GROUP BY PID, DID

qryDIDranking:
SELECT PID, DID, DIDcount,
(SELECT Count(*)
FROM qryDIDcounts AS T
WHERE T.DIDcount <= A.DIDcount
) AS DIDrank
FROM qryDIDcounts AS A

qryTop20:
SELECT PID, DID, DIDCOUNT, DIDrank
FROM qryDIDranking
WHERE DIDrank <= 20

Note that the above is just untested air code intended to
provide an outline of an approach. It is not a copy/paste
solution.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top