Selecting records using GROUP BY

J

Jason

I’m trying to use Access 2003 to run reports from a SQL Server 2005 back end
and I have a problem:
I have a table, dbo_ApplApplicantEmployers

Users can input multiple employers for each applicant, each employer going
into its own record. From there the employer will be designated as “Full
Time†or “Part Time†and “Current†or “Previousâ€.
My problem is that, for this particular report, I need to be able to pull
just one employer per applicant. Easy enough up to this point… But I want to
be able to specify the “best†employer from the available options… in other
words the one that is Full Time and Current (if it exists), and work back
wards from there if that combination is not available.

I am able to sort the records so that they appear in the proper order but
when I attempt to apply a GROUP BY and use the FIRST in the SELECT statement,
it is disregarding the sort order and pulling the 1st record in the database
table.

Any ideas would be welcomed and VERY appreciated

Thanks in advance,
Jason
 
P

Peter Yang[MSFT]

Hello Jason,


You may want to use subquery to see if it meets your requirement. column2
is the original the column you'd like to group by.


SELECT *
FROM tbl t1
WHERE t1.colum1 = ( SELECT TOP 1 t2.column1
FROM tbl t2
WHERE t2.colum2 = t1.column2 ) ;


If anything is unclear, please feel free to let's know. Thank you.

Best Regards,

Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Community Support
==================================================
Get notification to my posts through email? Please refer to
http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx#notif
ications
<http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx>.
Note: The MSDN Managed Newsgroup support offering is for non-urgent issues
where an initial response from the community or a Microsoft Support
Engineer within 1 business day is acceptable. Please note that each follow
up response may take approximately 2 business days as the support
professional working with you may need further investigation to reach the
most efficient resolution. The offering is not appropriate for situations
that require urgent, real-time or phone-based interactions or complex
project analysis and dump analysis issues. Issues of this nature are best
handled working with a dedicated Microsoft Support Engineer by contacting
Microsoft Customer Support Services (CSS) at
<http://msdn.microsoft.com/subscriptions/support/default.aspx>.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
 
G

Gary Walter

Jason said:
I'm trying to use Access 2003 to run reports from a SQL Server 2005 back
end
and I have a problem:
I have a table, dbo_ApplApplicantEmployers

Users can input multiple employers for each applicant, each employer going
into its own record. From there the employer will be designated as "Full
Time" or "Part Time" and "Current" or "Previous".
My problem is that, for this particular report, I need to be able to pull
just one employer per applicant. Easy enough up to this point. But I want
to
be able to specify the "best" employer from the available options. in
other
words the one that is Full Time and Current (if it exists), and work back
wards from there if that combination is not available.

I am able to sort the records so that they appear in the proper order but
when I attempt to apply a GROUP BY and use the FIRST in the SELECT
statement,
it is disregarding the sort order and pulling the 1st record in the
database
table.
Hi Jason,

In addition to Peter's sage help,
I might think about creating a small
employer rank table in your BE.

I assume one field in table can be "full or part,"
and another field can be "current or previous"?

tblEmployerRank
Rank f1 f2
1 Full Time Current
{etc.}

SELECT *
FROM tbl t1
WHERE t1.pk = ( SELECT TOP 1 t2.pk
FROM
tbl t2
INNER JOIN
tblEmployerRank AS R
ON
t2.f1 = R.f1
AND
t2.f2 = R.f2
WHERE
t2.column2 = t1.column2
ORDER BY R.Rank);

the correlating "column2" may actually be more
than one column (the column(s) you originally
grouped by)...

Apologies for butting in (especially if misunderstood).

good luck,

gary
 
J

Jason

As I said in my original post, ALL help is VERY appreciated, so thank you to
both of you. I haven’t had much experience using sub queries so this will be
new ground for me. I’ll let you know if I have any success (also if I need
more help). As far as adding a new table to the back end, that won’t happen.
I have read only access to the database.
 
G

Gary Walter

Jason said:
As I said in my original post, ALL help is VERY appreciated, so thank you
to
both of you. I haven't had much experience using sub queries so this will
be
new ground for me. I'll let you know if I have any success (also if I need
more help). As far as adding a new table to the back end, that won't
happen.
I have read only access to the database.

Hi Jason,

Maybe should provide the field names/field type of your table
with maybe some sample data for pertinent fields.

My *guess* is that you could use a SWITCH() function
in a query to still produce this "rank" in a separate query

qryEmployerRank

SELECT
pk,
column2,
SWITCH([f1]='Full Time' AND [f2]='Current',1,
[f1]='Part Time' AND [f2]='Current',2,
[f1]='Full Time' AND [f2]='Previous',3,
[f1]='Part Time' AND [f2]='Previous',4,
True,5) AS Rank
FROM
dbo_ApplApplicantEmployers;

{if that is the "rank" as you see it to be}

then use that query in your correlating subquery...

SELECT *
FROM dbo_ApplApplicantEmployers AS t1
WHERE t1.pk = ( SELECT TOP 1 t2.pk
FROM
qryEmployerRank AS t2
WHERE
t2.column2 = t1.column2
ORDER BY t2.Rank);
 
J

Jason

I’ve been looking at the subquery example you provided and either I’m not
making proper sense of it or it won’t work for my purposes. All of the data
is stored in just one table. The required fields are as follows:
AppEmployerID … Number … Primary Key
ApplicationID … Number
ApplicantID … Number
CurrentEmployerInd … Boolean … -1 = Current / 0 = Previous
EmploymentType … Number … 1 = Full Time / 2 = Part Time

SELECT dbo_ApplApplicantEmployers.ApplicantEmployerID,
dbo_ApplApplicantEmployers.ApplicationID,
dbo_ApplApplicantEmployers.ApplicantID,
dbo_ApplApplicantEmployers.CurrentEmployerInd,
dbo_ApplApplicantEmployers.EmploymentTypeID
FROM dbo_ApplApplicantEmployers
ORDER BY dbo_ApplApplicantEmployers.ApplicationID,
dbo_ApplApplicantEmployers.ApplicantID,
dbo_ApplApplicantEmployers.CurrentEmployerInd,
dbo_ApplApplicantEmployers.EmploymentTypeID;

I then need to GROUP BY dbo_ApplApplicantEmployers.ApplicationID and
dbo_ApplApplicantEmployers.ApplicantID

Basically each application can have multiple applicants and each applicant
and have multiple applications. The only records that I want to eliminate are
those that have the same application id and applicant id numbers… keeping, of
course, the record that comes closest to current and full time.
 
G

Gary Walter

Jason said:
I've been looking at the subquery example you provided and either I'm not
making proper sense of it or it won't work for my purposes. All of the
data
is stored in just one table. The required fields are as follows:
AppEmployerID . Number . Primary Key
ApplicationID . Number
ApplicantID . Number
CurrentEmployerInd . Boolean . -1 = Current / 0 = Previous
EmploymentType . Number . 1 = Full Time / 2 = Part Time

is that EmploymentTypeID?
and ApplicationEmployerID (the pk)?

isn't True in SQL Server = 1 (not -1)?
SELECT dbo_ApplApplicantEmployers.ApplicantEmployerID,
dbo_ApplApplicantEmployers.ApplicationID,
dbo_ApplApplicantEmployers.ApplicantID,
dbo_ApplApplicantEmployers.CurrentEmployerInd,
dbo_ApplApplicantEmployers.EmploymentTypeID
FROM dbo_ApplApplicantEmployers
ORDER BY dbo_ApplApplicantEmployers.ApplicationID,
dbo_ApplApplicantEmployers.ApplicantID,
dbo_ApplApplicantEmployers.CurrentEmployerInd,
dbo_ApplApplicantEmployers.EmploymentTypeID;

I then need to GROUP BY dbo_ApplApplicantEmployers.ApplicationID and
dbo_ApplApplicantEmployers.ApplicantID

Basically each application can have multiple applicants and each applicant
and have multiple applications. The only records that I want to eliminate
are
those that have the same application id and applicant id numbers. keeping,
of
course, the record that comes closest to current and full time.

Sorry for confusion Jason,

does this work for you?
(w/o side comments in subquery ORDER BY)

SELECT
A.ApplicationID,
A.ApplicantID,
FIRST(A.ApplicantEmployerID),
FIRST(A.CurrentEmployerInd),
FIRST(A.EmploymentTypeID)
FROM
dbo_ApplApplicantEmployers AS A
WHERE
A.ApplicantEmployerID =
(SELECT TOP 1 Q.ApplicantEmployerID
FROM
dbo_ApplApplicantEmployers AS Q
WHERE
Q.ApplicationID = A.ApplicationID
AND
Q.ApplicantID = A.ApplicantID
ORDER BY
Q.CurrentEmployerInd DESC, 'Boolean 1 = Current / 0 = Previous
Q.EmploymentType) 'Number 1 = Full Time / 2 = Part Time
GROUP BY
A.ApplicationID,
A.ApplicantID,
ORDER BY
A.ApplicationID,
A.ApplicantID;
 
J

Jason

Gary,
I’m just not getting any love from the subqueries!!! I keep getting an error
that says that the subquery will return a max of 1 record… even when I drop
out the TOP1 part of it. The good news is that I did figure out a working
solution using 3 separate queries. The 1st was created using 2 concatenated
fields: GROUP BY (ApplicationID & ApplicantID) and Min (Rank &
ApplicantEmployerID). I used the switch formula you showed me for the Rank, I
love it. This gave me the basis for the record set I am looking for. The 2nd
query was used to break out the ApplicantEmployerID field and to convert it
back to a number. For the 3rd and final query, I used a LEFT JOIN on the 2nd
query back to the original table to limit my record selection.
Not the most elegant approach but it works. I just wanted you thank you for
your help. You’ve also open my eyes to the power of using subqueries. I think
I’ll be doing some reading in the near future. Any titles or authors you
would suggest on the topic?

Jason
 
G

Gary Walter

Jason said:
I'm just not getting any love from the subqueries!!! I keep getting an
error
that says that the subquery will return a max of 1 record. even when I
drop
out the TOP1 part of it. The good news is that I did figure out a working
solution using 3 separate queries. The 1st was created using 2
concatenated
fields: GROUP BY (ApplicationID & ApplicantID) and Min (Rank &
ApplicantEmployerID). I used the switch formula you showed me for the
Rank, I
love it. This gave me the basis for the record set I am looking for. The
2nd
query was used to break out the ApplicantEmployerID field and to convert
it
back to a number. For the 3rd and final query, I used a LEFT JOIN on the
2nd
query back to the original table to limit my record selection.
Not the most elegant approach but it works. I just wanted you thank you
for
your help. You've also open my eyes to the power of using subqueries. I
think
I'll be doing some reading in the near future. Any titles or authors you
would suggest on the topic?
Hi Jason,

I don't know...

I had a nephew who was having trouble in a college Access class
and I got him the 2 "for Mere Motals" books (1 for SQL, 1 for db Design).
We went through some stuff that is just hard to get at first, I left him
with the
2 books, and he ended up acing the class.

Allen Browne has some good demonstrations on his excellent website.

I started out with the "Access 97 Developer's Handbook" which took
a brief look at SQL in an early chapter.

I had a query I could not figure out and came to this newsgroup.

I stuck around. Most of what I know I really learned from the
John's, and Duane, and Michel, and many others on this site
as I tried to figure out the help they gave to posters.

With correlated subqueries the lightbulb came on for me
when I "slowed down" the process to thinking about every
record produced by the join(s), and if used in WHERE clause,
asked myself how do I determine if return "this record" in
my final result.

for example, in our query that you tried

SELECT
A.ApplicationID,
A.ApplicantID,
FIRST(A.ApplicantEmployerID),
FIRST(A.CurrentEmployerInd),
FIRST(A.EmploymentTypeID)
FROM
dbo_ApplApplicantEmployers AS A
WHERE
A.ApplicantEmployerID =
(SELECT TOP 1 Q.ApplicantEmployerID
FROM
dbo_ApplApplicantEmployers AS Q
WHERE
Q.ApplicationID = A.ApplicationID
AND
Q.ApplicantID = A.ApplicantID
ORDER BY
Q.CurrentEmployerInd DESC,
Q.EmploymentType)
GROUP BY
A.ApplicationID,
A.ApplicantID,
ORDER BY
A.ApplicationID,
A.ApplicantID;

if you are "looking" at a record in main query
that has values for "ID's"

A.ApplicationID = 243
A.ApplicantID = 65

you could probably easily create "subquery"
as a separate query using those values.

SELECT TOP 1 Q.ApplicantEmployerID
FROM
dbo_ApplApplicantEmployers AS Q
WHERE
Q.ApplicationID = 243
AND
Q.ApplicantID = 65
ORDER BY
Q.CurrentEmployerInd DESC,
Q.EmploymentType

But we want to do this for "every record" we
"look at" in the main query, so instead of using
actual values for ApplicationID and ApplicantID,

we "correlate" back to the main query each time
to get those values

(SELECT TOP 1 Q.ApplicantEmployerID
FROM
dbo_ApplApplicantEmployers AS Q
WHERE
Q.ApplicationID = A.ApplicationID
AND
Q.ApplicantID = A.ApplicantID
ORDER BY
Q.CurrentEmployerInd DESC,
Q.EmploymentType)

Does that help?

good luck,

gary
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top