Newbie: Help with basic SQL

  • Thread starter Thread starter steve
  • Start date Start date
S

steve

Hello,

I dont know wether I have spreadsheet-ditis but I just can`t figure out a
way to do the following:


- Two tables , linked via a StationID.

One tblAQMeasurements has air-quality measurements for various pollutants
(ozone, mercury , etc.)
datetime | StationID | PolutantCode | value

where value is the measurement.

The other table is similar , but for meteorological data, not pollutants.
Like Temperature, humidity etc.
tblMetMeasurements
datetime | StationID | ParameterCode | value

Now , I want to construct a querry in SQL (TSQL or Access is not important,
I will figure out the syntactical details) that will extract *on the same
rows* information about more that one pollutants and/or more than one
parameters (of course for the same station). The only, "linking" variable
will be the datetime.
For example:

datetime - Temperature measurement - Ozone measurement - Mercury measurement

remember the first is a Meteorological parameter (tblMetMeasurement) the
other two pollutants (tblAQMeasurements).

The reason for this is to eventually extract to Excel or something and do
statistics. Is there a way to do this?
I have a feeling the answer will be either too obvious (embarassing) OR
no-way. Am I missing completely the idea of relational databases ?
I dont want a Union, i.e. have the data "in series". I want them "next to
each other".
For a specific station there are a few pollutants (PollutantCode) and a few
meteorological parameters (ParameterCode) along with their hourly
measurements.
BTW, I think the design is fine, the tables are not directly related by
SationID but indirectly. What i show here is a small example for you to
understand. If it can be done here it can be done in my case.

Before you ask: There are *NO* guarantees that I *will* have data for all
three parameters-pollutants on the *same* datetime ! In this case I just
dont display them. Do not assumme that i have all measurements for a
specific period of time.

TIA for any help/suggestions
-steve
 
Assuming that the keys are (datetime, stationid, pollutantcode) and
(datetime, stationid, parametercode). Try:

SELECT COALESCE(A.dt,B.dt) AS dt, A.temp, humid, ozone, mercury
FROM
(SELECT dt,
MIN(CASE WHEN parametercode = 'temp' THEN value END) AS temp,
MIN(CASE WHEN parametercode = 'humid' THEN value END) AS humid
FROM tblMetMeasurements
GROUP BY dt) AS A
FULL JOIN
(SELECT dt,
MIN(CASE WHEN pollutantcode = 'ozone' THEN value END) AS ozone,
MIN(CASE WHEN pollutantcode = 'mercury' THEN value END) AS mercury
FROM tblAQMeasurements
GROUP BY dt) AS B
ON A.dt = B.dt
 
Actually I left out StationID altogether, as you did also. I expect you
will want to add StationID to the join ON criteria and the GROUP BY,
otherwise I'm not sure what result you would require from this. Without
SationID you'll just get one set of values for each datetime (the
minimum values for each measurement), which probably isn't very useful:

SELECT COALESCE(A.dt,B.dt) AS dt,
COALESCE(A.stationid,B.stationid) AS stationid,
A.temp, A.humid, B.ozone, B.mercury
FROM
(SELECT dt, stationid,
MIN(CASE WHEN parametercode = 'temp' THEN value END) AS temp,
MIN(CASE WHEN parametercode = 'humid' THEN value END) AS humid
FROM tblMetMeasurements
GROUP BY dt, stationid) AS A
FULL JOIN
(SELECT dt, stationid,
MIN(CASE WHEN pollutantcode = 'ozone' THEN value END) AS ozone,
MIN(CASE WHEN pollutantcode = 'mercury' THEN value END) AS mercury
FROM tblAQMeasurements
GROUP BY dt, stationid) AS B
ON A.dt = B.dt

If I have't answered your question then maybe you could post some
sample data to explain better. The best way to do that is to post
CREATE TABLE statements followed by INSERT statements. Also show us
your required end result.
 
I think with this example it will be absolutely clear:

tblMeteoMeasurements
date_time METEOid ParameterCode Measurement
8/1/2004 11:00:00 PM 6106000 156 19
8/2/2004 11:00:00 PM 6106000 074 162
8/2/2004 11:00:00 PM 6106000 076 9
8/2/2004 11:00:00 PM 6106000 078 210
8/2/2004 11:00:00 PM 6106000 080 74
8/2/2004 11:00:00 PM 6106000 156 20
8/3/2004 11:00:00 PM 6106000 074 -999
9/7/1995 1:00:00 AM 70132G9 076 15
9/7/1995 1:00:00 AM 70132G9 078 137
9/7/1995 1:00:00 AM 70132G9 080 54
9/7/1995 1:00:00 AM 70132G9 156 25
.........


tblAQMeasurements
date_time METEOid PollutantCode Measurement
8/1/2004 11:00:00 PM 6106000 023 5
8/2/2004 11:00:00 PM 6106000 002 12.3
8/2/2004 11:00:00 PM 6106000 015 7
8/2/2004 11:00:00 PM 6106000 004 2
8/2/2004 11:00:00 PM 6106000 015 8
8/2/2004 11:00:00 PM 6106000 004 1
8/3/2004 11:00:00 PM 6106000 015 5
9/7/1995 1:00:00 AM 70132G9 004 1
9/7/1995 1:00:00 AM 70132G9 015 1
9/7/1995 1:00:00 AM 70132G9 002 0.6
9/7/1995 1:00:00 AM 70132G9 015 0
.........



Now, On 8/1/2004 11:00:00 PM I want the following result from my query :

date_time METEOid PollutantCode Meas.
PollutantCode Meas. ParameterCode Meas. ParameterCode Meas.
8/1/2004 11:00:00 PM 6106000 023 5
002 12.3 156 19 074
162

..... etc.

(the spaces are mine and used to group things together for visual purposes)



Parameters 156 and 074 (Temperature and Humidity for example) and their
measurements,
ALONGSIDE 023 and 002 (ozone and mercury) and their measurements for the
*same* datetime AND same station

The goal of this is to have for the same time (so i can do statistics later
on!)
the value of e.g. the temperature, and humidity and ozone and mercury WHEN
the
temperature was above 2.0 and the mercury was above 9.

which means that for a specific station (or more if i decide so in the
conditions),
I have ***for the same Datetime*** two different Pollutants (023, 002) and
their
measurements on that date AND two Meteorological parameters with their
measurements.

I think the best way way to summarize is to ask: How can in SQL, extract
the same field
that i have already extracted, BUT with a different value AND *on the same
row* as itself.


What is also important are the various conditions. Notice how one
measurement is -999, that means
that it`s not valid. In such a case for example i will disregard the whole
row, i.e. ALL other values,
so that i can perform my statistics only when all values are present AND
satisfy certain conditions.
(WHERE part of the querry)

I hope my question is clearer now. Is there a simple way to do this so that
I can just add the various
restrictions/conditions later on on a "separate" WHERE part?

(The querries are supposed to be generated by a VB interface and therefore
i have to find a logical grouping
in order to translate from the interface into the SQL.)

I really appreciate your efforts!
TIA
-steve
 
Back
Top