SQL query/join syntax

S

Spektre

This is actually a MS Access/SQL, but used in conjunction with VB.net

This is a "simple" multitable join question. Some skeleton data to work
with is provided. Consider a database of collected classic television
episodes.

Table: Networks
NetworkID Network
1 NBC
2 CBS
3 ABC
4 PBS
5 FOX

Table: Series
SeriesID SeriesName NetworkID
1 Roseanne 3
2 Cheers 1
3 House 5

Table: Episodes
EpisodeID EpisodeName SeriesID
1 A New Becky 1
2 Roseanna Sings 1
3 Norm's Beer 2
4 Sammy Goes Long 2

Table: CollectedEpisodes
CollectedEpisodeID EpisodeID Format
1 1 VHS
2 1 DVD
3 3 TiVo
4 4 TiVo


In text, the tables define respectively, TV networks, Series Names, Episode
Names, and finally which episodes a person has collected. Going down the
list each table entry has a 1 to many mapping (ie many series per network,
many episodes per series, many possible collected episodes (via different
media) per episode.)

I would like a query that shows, for the entire collection (entries in
CollectedEpisodes), how many of the episodes are from each series...from each
network.

This for the skeleton data given above. The query would return.

Roseanne 2
Cheers 2
House 0

and

ABC 2
NBC 2

How would I accomplish that? Simple joins I understand...this one is beyond
me.

I would expect the answer to fall into the category of "here is the Join
syntax to return all records from 'NBC'" and then do a count, but I cannot
work out the syntax.


Also secondly, is one query possible that says

"For each series that has at least one Collected Episode, here are the
Episodes needed to complete the respective series."


Thanks in advance
 
K

Ken Snell \(MVP\)

For your first question (identifying number of series and networks), you
need to build two queries and save them:


qryDistinctNetworks

SELECT DISTINCT Networks.NetworkID, Networks.Network
FROM ((CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID)
INNER JOIN Network ON Network.NetworkID = Series.NetworkID;




qryDistinctSeries

SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID;


Then, these two queries should give you the desired counts:

SELECT (SELECT Count(Q.NetworkID) FROM qryDistinctNetworks Q
WHERE Q.NetworkID = qryDistinctNetworks.NetworkID) AS CountOfNetwork,
Network, NetworkID
FROM qryDistinctNetworks;



SELECT (SELECT Count(Q.SeriesID) FROM qryDistinctSeries Q
WHERE Q.SeriesID = qryDistinctSeries.SeriesID) AS CountOfSeries,
SeriesName, SeriesID
FROM qryDistinctSeries;



For the sdecond question (which episodes haven't been collected yet), use
this query:

SELECT Episodes.EpisodeID, Episodes.EpisodeName
FROM Episodes LEFT JOIN CollectedEpisodes
ON Episodes.EpisodeID = CollectedEpisodeID
WHERE CollectedEpisodeID IS NULL;
 
S

Spektre

Thanks for the help Ken,

My two follow up questions are:

1. How do you "save a query" in Visual BASIC such that you can query on a
query? I generally perform a query and then read the results with a
DataReader. I suppose this method will not work well for this.

2. The second query, from what I can see, does not do what was hoped for.
It gives back ALL episodes that are not collected whehter you have an episode
fomr a given series or not. The desired behaviour is to only give the
uncollected episodes for series that have at least one collected episode.

(Off to find out how to query a query in Visual Basic.)
 
K

Ken Snell \(MVP\)

Instead of saving the two queries that I outlined, you can use these two
queries for the first two results that you seek:

SELECT (SELECT Count(Q.NetworkID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID) Q
WHERE Q.NetworkID = qryDistinctNetworks.NetworkID) AS CountOfNetwork,
qryDistinctNetworks.Network, qryDistinctNetworks.NetworkID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID)
qryDistinctNetworks;



SELECT (SELECT Count(Q.SeriesID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID) Q
WHERE Q.SeriesID = qryDistinctSeries.SeriesID) AS CountOfSeries,
qryDistinctSeries.SeriesName, qryDistinctSeries.SeriesID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID)
qryDistinctSeries;




For the second question (which episodes haven't been collected yet), use
this query (I did indeed not filter out for noncollected episodes)

SELECT Episodes.EpisodeID, Episodes.EpisodeName
FROM Episodes LEFT JOIN CollectedEpisodes
ON Episodes.EpisodeID = CollectedEpisodeID
WHERE CollectedEpisodeID IS NULL AND
Episodes.SeriesID IN (SELECT DISTINCT Episode.SeriesID
FROM Episodes INNER JOIN CollectedEpisodes
ON Episodes.EpisodeID = CollectedEpisodes.EpisodeID);
 
S

Spektre

Thanks again Ken,

i am trying to pick apart the SELECT statements you provided to understand
how they will work. Some of the clauses just seem nonsensical to me.

For example in the 1st query:

"...INNER JOIN Series ON Series.SeriesID = Episodes.EpisodeID..."

This seems to be comparing apples to oranges. Series.SeriesID refers to a
key identifying a unique series. Episodes.EpisodeID refers to a key
identifying a unque episode. There will be cases of equivalence, but you are
comparing disparate item types. Did you mean Episodes.SeriesID?

The rest of it is going to take me days to deconstruct lol. I am not
familar with the AS keyword nor the use of "renaming" as you have done with Q
and qryDistinctNetworks.
 
K

Ken Snell \(MVP\)

My apologies; typos....

SELECT (SELECT Count(Q.NetworkID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID) Q
WHERE Q.NetworkID = qryDistinctNetworks.NetworkID) AS CountOfNetwork,
qryDistinctNetworks.Network, qryDistinctNetworks.NetworkID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
qryDistinctNetworks;



SELECT (SELECT Count(Q.SeriesID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID) Q
WHERE Q.SeriesID = qryDistinctSeries.SeriesID) AS CountOfSeries,
qryDistinctSeries.SeriesName, qryDistinctSeries.SeriesID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
qryDistinctSeries;


The "renaming" is using an alias for the query SQL statement. In SQL, one
can use AS or just leave it out when identifying an alias for a query:

SELECT (SELECT Count(Q.SeriesID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID) Q
WHERE Q.SeriesID = qryDistinctSeries.SeriesID) AS CountOfSeries,
qryDistinctSeries.SeriesName, qryDistinctSeries.SeriesID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
qryDistinctSeries;

is the same as

SELECT (SELECT Count(Q.SeriesID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID) AS Q
WHERE Q.SeriesID = qryDistinctSeries.SeriesID) AS CountOfSeries,
qryDistinctSeries.SeriesName, qryDistinctSeries.SeriesID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM (CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
AS qryDistinctSeries;


Put all together this way, the query is definitely not easy to quickly
understand. That is why I'd initially recommended saving the two original
queries. All I did to create these new queries was to replace the saved
query names with the full SQL statements in the "final" queries. It's still
possible that the queries will need some debugging (I didn't test them,
unfortunately), as I wrote them "off the top of my head". Let me know if you
find problems.
--

Ken Snell
<MS ACCESS MVP>
 
S

Spektre

No apologies necessary Ken, thanks for taking the time.

Another follow up question...

Dissecting the query it looks like "Q" and "qryDistinctNetworks" refer to
the same thing right?

Namely:

(SELECT DISTINCT Series.SeriesID, Series.SeriesName FROM (CollectedEpisodes
INNER JOIN Episodes ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)

right?




INNER JOIN Series ON Series.SeriesID = Episodes.Series
 
S

Spektre

At the risk of being a pest...

In the first "superquery" we get the network name from

qryDistinctNetworks.Network

However the alias qryDistinctNetworks is made up of joins from the tables
Series,Episodes, and CollectedEpisodes. The table Networks is not included
in the join so we cannot access Networks.Network can we?

Fred
 
S

Spektre

I think I see what happened. When you copied the original stored queries
into the 2nd queries, you copied the same stored query to each "superquery".
The network subquery is different.
 
K

Ken Snell \(MVP\)

Yep, still pesky typos:


SELECT (SELECT Count(Q.NetworkID) FROM
(SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM ((CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
INNER JOIN Network ON Network.NetworkID = Series.NetworkID)
Q
WHERE Q.NetworkID = qryDistinctNetworks.NetworkID) AS CountOfNetwork,
qryDistinctNetworks.Network, qryDistinctNetworks.NetworkID
FROM (SELECT DISTINCT Series.SeriesID, Series.SeriesName
FROM ((CollectedEpisodes INNER JOIN Episodes
ON CollectedEpisodes.EpisodeID = Episodes.EpisodeID)
INNER JOIN Series ON Series.SeriesID = Episodes.SeriesID)
INNER JOIN Network ON Network.NetworkID = Series.NetworkID)
qryDistinctNetworks;


You're not a "pest".... these types of queries get pretty complicated when
written as a single SQL statement. Good eyes on your part!
--

Ken Snell
<MS ACCESS MVP>
 
S

Spektre

Ken,

I a trying to implement the query and getting an error message

Do you have any more time to help on this?

Basically the database I must work with (unable to change the schema) has
poor field naming. The structure is the same as I made up for the board
questions (which I tried to make easy to understand to take the naming out
of the equation) but the names are actually as so:

Table: publisher
Fields: ID, PubName

Table: series
Fields: ID, Bk_Name, PubID

Table: issues
Fields: ID, SeriesID

Table: collection
Fields: ID, IssueID

In each case "ID" is a key/auto increment field, and the next lower table
links up thru the last field listed.

Based on the syntax of the query you thoughtfully provided I came up with

SELECT (SELECT COUNT(Q.ID) FROM (SELECT DISTINCT series.ID, series.Bk_Name
FROM ((collection INNER JOIN issues ON collection.IssueID = issues.ID) INNER
JOIN series ON series.ID = issues.SeriesID) INNER JOIN publishers ON
publishers.ID = series.PubID) Q WHERE Q.ID = qryDistinctPublishers.ID) As
CountOfPublisher, qryDistinctPublishers.PubName, qryDistinctPublishers.ID
FROM (SELECT DISTINCT series.ID, series.Bk_Name FROM ((collection INNER JOIN
issues ON collection.IssueID = issues.ID) INNER JOIN series ON series.ID =
issues.SeriesID) INNER JOIN publishers ON publishers.ID = series.PubID)
qryDistinctPublishers

When I try to execute this query I recieve the error message:

"No value given for one or more required parameters."

Any ideas on what i am doing wrong would be greatly appreciated.

Many thanks
 
K

Ken Snell \(MVP\)

Try this:

SELECT (SELECT COUNT(Q.SID) FROM (SELECT DISTINCT series.ID AS SID
FROM ((collection INNER JOIN issues ON collection.IssueID = issues.ID) INNER
JOIN series ON series.ID = issues.SeriesID) INNER JOIN publishers ON
publishers.ID = series.PubID) Q WHERE Q.SID = qryDistinctPublishers.S_ID) As
CountOfPublisher, qryDistinctPublishers.Pub_Name, qryDistinctPublishers.S_ID
FROM (SELECT DISTINCT series.ID AS S_ID, publishers.PubName AS Pub_Name
FROM ((collection INNER JOIN
issues ON collection.IssueID = issues.ID) INNER JOIN series ON series.ID =
issues.SeriesID) INNER JOIN publishers ON publishers.ID = series.PubID)
qryDistinctPublishers


I've given an alias to the "series.ID" field in the primary FROM clause so
that you can reference it as a single name in the subquery. I also added
alias for "series.ID" field in the subquery for similar purpose. I also
removed the unneeded reference to "series.Bk_Name" field in the subquery and
primary FROM clause query; I also added "publishers.PubName" field (with an
alias) to the main FROM clause query so that you can display it in the
overall query.

The addition of aliases may not be needed in the final structure, but leave
them in until you get the query to work. Then you can experiment with taking
them out and using just field name in the field display list to see if all
works ok without them.
--

Ken Snell
<MS ACCESS MVP>
 
S

Spektre

Hooray...Intellisense had added some characters. The query now runs!!

Can I ask, what is the expected output? IE how many colums are in each row
and what are they?
 
S

Spektre

Hmmm,

I followed the commas and think I understand the output.

I am not getting what I expect.

The output has 898 records. I would expect 6-8 records each denoting a
count of collection entries for a given publisher. For reference, there are
about 8000 collection entries in the database, from about 6-8 publishers (I'd
have to do a quick by hand count to be more accurate).
 
K

Ken Snell \(MVP\)

This query will have three fields in the output (CountOfPublisher, Pub_Name,
and S_ID), and I'd expect one record for each unique combination of
series.ID AS S_ID and publishers.PubName, based on the way you restructured
the query's contents:

SELECT (SELECT COUNT(Q.SID) FROM (SELECT DISTINCT series.ID AS SID
FROM ((collection INNER JOIN issues ON collection.IssueID = issues.ID) INNER
JOIN series ON series.ID = issues.SeriesID) INNER JOIN publishers ON
publishers.ID = series.PubID) Q WHERE Q.SID = qryDistinctPublishers.S_ID) As
CountOfPublisher, qryDistinctPublishers.Pub_Name, qryDistinctPublishers.S_ID
FROM (SELECT DISTINCT series.ID AS S_ID, publishers.PubName AS Pub_Name
FROM ((collection INNER JOIN
issues ON collection.IssueID = issues.ID) INNER JOIN series ON series.ID =
issues.SeriesID) INNER JOIN publishers ON publishers.ID = series.PubID)
qryDistinctPublishers


Same for the other query that is similar in scope.

As for the number of rows that you're getting, it's hard to comment about
that because I haven't seen actual data. But with the change in the
"structure", perhaps you can state in words what the individual records
should be showing for your desired query's output.
 
S

Spektre

In the output, each CountofPublisher is 1.

What I am hoping for is one record for each publisher.ID (PubName names may
be weird and not be unique) that has a corresponding entry in collection.
Using the example "data" in the first post I would expect the output form
this query to be something like:

3, ABC, 3080
1, NBC, 2000
2, CBS, 1025
5, Fox, 823
4, PBS, 490
<end of output>

Which represent publisher.ID, publisher.PubName, "COUNT(entries in
collection which come from that network)"

Similarly for the 2nd query the output would be:

126, Star Trek, 72
2034, Cheers, 53
3100, Roseanne, 29
etc.

Which are series.ID, series.Bk_Name, and "COUNT(entries in collection which
come from that series)"

The last query would hopefully output

10035, 2034, Cheers
10036, 2034, Cheers
2126, 126, Star Trek
etc.

whose enties are:

issue.ID, series.ID, series.Bk_Name

and represent episodes not yet collected but for which you have one of the
series collected (unique issue.ID's NOT in collection for series which have
at least one entry in collection)

I hope that makes sense.
 
S

Spektre

Ken,

Getting closer. The following query gives the Count I am looking for for
ONE publisher.

sqlStr = "SELECT COUNT(publishers.ID) FROM (((collection INNER JOIN issues
ON collection.IssueID = issues.ID) INNER JOIN series ON issues.SeriesID =
series.ID) INNER JOIN publishers ON publishers.ID = series.PubID) WHERE
publishers.ID = 54"

I'd like this be extended to include a seperate record for each publisher,
and to include the Publisher.ID, and Publisher.PubName fields in each record.
 
S

Spektre

Ken,

SUCCESS!!

sqlStr = "SELECT publishers.PubName, COUNT(*) FROM (((collection INNER JOIN
issues ON collection.IssueID = issues.ID) INNER JOIN series ON
issues.SeriesID = series.ID) INNER JOIN publishers ON publishers.ID =
series.PubID) GROUP BY publishers.PubName ORDER BY COUNT(*)"

Works great...

Could still use some help on the "Episodes not collected" query
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top