Difficult Query

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I have been asked to pull some summary data out of a table, and due to the
structure, can't seem to get it. Any help would be appreciated.

The fields in the table look something like this:
TestID TestCode Area CFU OrgID1 OrgID2 OrgID3 OrgID4 OrgID5 OrgID6 OrgID7
OrgID8 OrgID9 OrgID10 Notes

Each record represents a microbiology monitoring test. The fields OrgID1,
OrgID2, OrgID3 etc. contain the name of the organism that was identified from
that test. The record has room for up to ten organisms.

Here's the problem: I need to summarize how many times a particular organism
has come up, but it may be in different ID fields.

For example, if I wanted the number of times ""Bacillus spp." has come up:
Record 1 - OrgID1 = "Bacillus spp."
Record 1 - OrgID2 = "CNS"

Record 2 - OrgID1 = "Staph A."
Record 2 - OrgID2 = "Bacillus spp."

Record 3 - OrgID1 = "Staph A."
Record 3 - OrgID2 = "CNS"
Record 3 - OrgID3 = "Bacillus spp."

The answer I'm looking for in this example is two. But I can't seem to
design a query that will look for a particular organism across different
fields. I tried a select query that used ="Bacillus spp." or Is Null as the
critera for the OrgIDFields. But if it doesn't find "Bacillus spp." in the
OrgID1 field, then it won't find it in the OrgID2 field.

Help! I've also tried pivot tables, with the same luck.

Thanks!
Scott D. Collins
 
This isn't the prettiest way, but it worked for me:

SELECT Count(Table1.ID) AS CountOfID
FROM Table1
WHERE (((Table1.f1)="x")) OR (((Table1.f2)="x")) OR (((Table1.f3)="x")) OR ((
(Table1.f4)="x")) OR (((Table1.f5)="x"));

The way to acheive this in the query designer is to put each criteria on a
seperate row. This gives you the "or" effect.

Gina
 
The problem is that you have a spreadsheet, not a normalized table. Thus,
Access can't really help you. You can do a CountIF() in Excel easier than
you can do the count in Access.

If you were to normalize the data into a child table, then Access would be
the way to go.
 
In the query grid under Criteria, put Like "baci" by the first OrgID, put the
same thing, only on the next line for the next OrgID, and so on. Be sure you
put the criteria on a seperate line for each OrgID and it will work.
 
I'll go along with what others have said here, again assuming that you
have some control over the Tables and can restructure them.

Even if you don't, it's possible to run Queries that will produce Tables
structured the way people have suggested here (without the "repeating
groups" of [OrgID1], [OrgID2], &c., which have essentially the same kind
of data in them, just different values). I've suggested some examples
at the end of this message.
due to the



oldstonebuddha,

I see one general solution has been proposed earlier, but if you
have control or influence over this database, then the rest of my
post is in regards to that.

This table has repeating columns. This goes against the database
normalization rule called "1st Normal Form". (See the links I have
provided below for more infomration on database normalization.)

The table structure you have given is why you are experiencing
trouble retrieving the data you want.

Try:

Tables:

CREATE TABLE Oganisms
(OrganismID INTEGER
,CommonName TEXT(255)
,Description MEMO
,CONSTRAINT pk_MicroOrganisms
PRIMARY KEY (MicroOrganism)
)

CREATE TABLE Tests
(TestID INTEGER
,TestName TEXT(255)
,TestDescription MEMO
,CONSTRAINT pk_Test
PRIMARY KEY (TestID)
)

CREATE TABLE OrganismTests
(OrganismTestID INTEGER
,OrganismID INTEGER
,TestID INTEGER
,TestDate DATETIME
,CONSTRAINT pk_OrganismTests
PRIMARY KEY (OrganismTestID)
,CONSTRAINT fk_OrganismTests_Organisms_OrganismID
FOREIGN KEY (OrganismID)
REFERENCES Organisms (OrganismID)
,CONSTRAINT fk_OrganimsTests_Tests_TestID
FOREIGN KEY (TestID)
REFERENCES Tests (TestID)
)




particular organism


come up:

Untested (sorry, but any mistakes should only be typos, the idea is
the correct one).

SELECT COUNT(OT1.OrganismID) AS CountOfOrganism
FROM (OrganismTests AS OT1
INNER JOIN
Tests AS T1
ON OT1.TestID = T1.TestID)
INNER JOIN
Organisms AS O1
ON OT1.OrganismID = O1.OrganismID
WHERE O1.CommonName = "Bacillus sup."

Note: Technically, the JOIN to Tests isn't required here, but I
included it anyway based on the probability that information from
Tests might also be needed for similar queries (and so its inclusion
is just an overall part of the example).





seem to


Null as the


spp." in the




Database Normalization:

---------------------------

Googling on Database Normalization will provide many good articles.
Most of them are written for other database professionals (the
actual written rules of Database Normalization will throw you for a
loop if you don't know exactly what they are talking about).

Database Normalization is a process of decomposition, of breaking
down information until you get down the basic elements that describe
entities, and making sure that the information kept about an entity
(i.e. "table") *really* belongs to it, and not to something else (or
somewhere else).

---------------------------

Database Normalization:

Basics:

About.com
http://databases.about.com/od/specificproducts/a/normalization.htm


Intermediate:

MySQL's website:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html


Advanced:

Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization


Very Advanced:

University of Texas:
(I quite like this whole site, since it has a handy menu on the
right
describing many important aspects of database normalization and
modeling.)
http://www.utexas.edu/its/windows/database/datamodeling/rm/rm7.html


Sincerely,

Chris O.

My suggestion (and bear in mind that there are many ways to organize
your information) would be to omit the [OrgID...] fields from your main
Table, so what's left might look like this:

[Test] Table Datasheet View:

TestID TestCode Area CFU Notes
------ -------- ---- ---- -----
1 Q 51 100 Eb
2 X 22 100 C#
3 R 69 100 G

I assume that [TestID] is a unique datum that may be used to identify a
record in this Table. If not, you might specify an Autonumber field to
do that. For my example, I'm using [TestID] as the primary
(identifying) key value.

I also pulled out the organisms' names into a separate Table, allowing
you to avoid having to fiddle with typing them. You can just look them
up in a List Box or Combo Box, and if you develop a desire to edit one
of the names, it appears only in this Table, so one change takes care of
all.

[Organisms] Table Datasheet View:

Organism_ID Organism
----------- -------------
58622514 Staph A.
1721486692 Bacillus spp.
1760227989 CNS

The remainder of your fields (the [OrgID3], &c., fields) can be put into
a Table that might look like this. The [TestID] field links the record
to the record in [Test] to which it belongs. The [Organism_ID] field
links to a record in [Organisms].

[Data] Table Datasheet View:
Data_ID TestID Organism_ID OrgNum
----------- ------ ----------- ------
-1542185487 3 1760227989 2
-346997942 2 58622514 1
-343950416 1 1760227989 2
-330245753 1 1721486692 1
132486925 2 1721486692 2
744851132 3 58622514 1
1907345758 3 1721486692 3

The value of [TestID], I assume, means something to you, so I left it
alone, but those [Organism_ID] values are not pretty. I defined a
Lookup property on that field so that it would display the name of the
organism instead of the key value. (But the key value is what's
actually stored in the Table. If this is confusing, don't use Lookup
properties.)

[Data] Table Datasheet View, with Lookup property set on [Organism_ID]:

Data_ID TestID Organism_ID OrgNum
----------- ------ ------------- ------
-1542185487 3 CNS 2
-346997942 2 Staph A. 1
-343950416 1 CNS 2
-330245753 1 Bacillus spp. 1
132486925 2 Bacillus spp. 2
744851132 3 Staph A. 1
1907345758 3 Bacillus spp. 3

Those are all the Tables needed for my example. My first Query ties the
Tables together and lists relevant fields for each test. (I omitted
some to save space here, but you may include many more than I show here.)

[Q_ListAll] SQL:

SELECT Test.TestID, Test.TestCode, Test.Area,
"OrgID" & [OrgNum] AS ID, Organisms.Organism,
Data.Data_ID, Organisms.Organism_ID
FROM Test INNER JOIN (Organisms INNER JOIN Data
ON Organisms.Organism_ID = Data.Organism_ID)
ON Test.TestID = Data.TestID
ORDER BY Test.TestID, "OrgID" & [OrgNum];

The [Data_ID] and [Organism_ID] fields are included not for people to
look at, but because they're needed in subsequent Queries.

Note that this Query looks something like the list you posted in your
message.

[Q_ListAll] Query Datasheet View:

Test Test Area ID Organism Data_ID Organism_ID
ID Code
---- ---- ---- ------ ------------- ----------- -----------
1 Q 51 OrgID1 Bacillus spp. -330245753 1721486692
1 Q 51 OrgID2 CNS -343950416 1760227989
2 X 22 OrgID1 Staph A. -346997942 58622514
2 X 22 OrgID2 Bacillus spp. 132486925 1721486692
3 R 69 OrgID1 Staph A. 744851132 58622514
3 R 69 OrgID2 CNS -1542185487 1760227989
3 R 69 OrgID3 Bacillus spp. 1907345758 1721486692

Now that the data are organized (without repeating groups) to make them
easy to analyze, you can count the occurrences easily. CAUTION: I'm not
sure exactly how you intended to count them, so you may need to alter
this Query to give you the results you want. For example, you said that
you wanted the number reported for "Bacillus spp." to be 2, but since it
occurs 3 times in your sample data, this version of the Query displays 3.

[Q_TestsByOrganism] SQL:

SELECT Organisms.Organism,
Count(Q_ListAll.Data_ID) AS CountOfData_ID
FROM Organisms INNER JOIN Q_ListAll
ON Organisms.Organism_ID = Q_ListAll.Organism_ID
GROUP BY Organisms.Organism
ORDER BY Organisms.Organism;

This just counts each species once for each time it appears in any
column in the original Table, even if it's listed twice in the same test.

[Q_TestsByOrganism] Query Datasheet View:

Organism CountOfData_ID
-------------- --------------
Bacillus spp. 3
CNS 2
Staph A. 2

Another Query that you might find helpful would kind of reproduce what I
think your Table looks like right now (including the repeating groups),
but since it's a Query and not a Table, the repeating groups don't
present an analysis headache.

[Q_ListAll_Crosstab] SQL:

TRANSFORM First(Q_ListAll.Organism)
AS FirstOfOrganism
SELECT Q_ListAll.TestID, Q_ListAll.Area
FROM Q_ListAll
GROUP BY Q_ListAll.TestID, Q_ListAll.Area
PIVOT Q_ListAll.ID;

The results look like this:

[Q_ListAll_Crosstab] Query Datasheet View:

TestID Area OrgID1 OrgID2 OrgID3
------ ---- ------------- ------------- -------------
1 51 Bacillus spp. CNS
2 22 Staph A. Bacillus spp.
3 69 Staph A. CNS Bacillus spp.

You might want to abbreviate the names, so that you can include all the
columns on one line. If you use this Query as the basis for a Report,
that shouldn't be a problem, as you could put some of the fields on
successive lines and include blank space between records.

To return to an earlier comment, if you do NOT have control over the
structure of your Table, and all you can do is to read it, it's not
difficult to write Queries which will split out those repeating fields.
If you would like to do that and run into trouble, there are plenty of
people here who can help you.

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
oldstonebuddha said:
I have been asked to pull some summary data out of a table, and due to the
structure, can't seem to get it. Any help would be appreciated.

The fields in the table look something like this:
TestID TestCode Area CFU OrgID1 OrgID2 OrgID3 OrgID4 OrgID5 OrgID6 OrgID7
OrgID8 OrgID9 OrgID10 Notes

oldstonebuddha,

I see one general solution has been proposed earlier, but if you
have control or influence over this database, then the rest of my
post is in regards to that.

This table has repeating columns. This goes against the database
normalization rule called "1st Normal Form". (See the links I have
provided below for more infomration on database normalization.)

The table structure you have given is why you are experiencing
trouble retrieving the data you want.

Try:

Tables:

CREATE TABLE Oganisms
(OrganismID INTEGER
,CommonName TEXT(255)
,Description MEMO
,CONSTRAINT pk_MicroOrganisms
PRIMARY KEY (MicroOrganism)
)

CREATE TABLE Tests
(TestID INTEGER
,TestName TEXT(255)
,TestDescription MEMO
,CONSTRAINT pk_Test
PRIMARY KEY (TestID)
)

CREATE TABLE OrganismTests
(OrganismTestID INTEGER
,OrganismID INTEGER
,TestID INTEGER
,TestDate DATETIME
,CONSTRAINT pk_OrganismTests
PRIMARY KEY (OrganismTestID)
,CONSTRAINT fk_OrganismTests_Organisms_OrganismID
FOREIGN KEY (OrganismID)
REFERENCES Organisms (OrganismID)
,CONSTRAINT fk_OrganimsTests_Tests_TestID
FOREIGN KEY (TestID)
REFERENCES Tests (TestID)
)

Here's the problem: I need to summarize how many times a particular organism
has come up, but it may be in different ID fields.

For example, if I wanted the number of times ""Bacillus spp." has
come up:

Untested (sorry, but any mistakes should only be typos, the idea is
the correct one).

SELECT COUNT(OT1.OrganismID) AS CountOfOrganism
FROM (OrganismTests AS OT1
INNER JOIN
Tests AS T1
ON OT1.TestID = T1.TestID)
INNER JOIN
Organisms AS O1
ON OT1.OrganismID = O1.OrganismID
WHERE O1.CommonName = "Bacillus sup."

Note: Technically, the JOIN to Tests isn't required here, but I
included it anyway based on the probability that information from
Tests might also be needed for similar queries (and so its inclusion
is just an overall part of the example).


The answer I'm looking for in this example is two. But I can't seem to
design a query that will look for a particular organism across different
fields. I tried a select query that used ="Bacillus spp." or Is Null as the
critera for the OrgIDFields. But if it doesn't find "Bacillus spp." in the
OrgID1 field, then it won't find it in the OrgID2 field.

Help! I've also tried pivot tables, with the same luck.

Thanks!
Scott D. Collins


Database Normalization:

---------------------------

Googling on Database Normalization will provide many good articles.
Most of them are written for other database professionals (the
actual written rules of Database Normalization will throw you for a
loop if you don't know exactly what they are talking about).

Database Normalization is a process of decomposition, of breaking
down information until you get down the basic elements that describe
entities, and making sure that the information kept about an entity
(i.e. "table") *really* belongs to it, and not to something else (or
somewhere else).

---------------------------

Database Normalization:

Basics:

About.com
http://databases.about.com/od/specificproducts/a/normalization.htm


Intermediate:

MySQL's website:
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html


Advanced:

Wikipedia:
http://en.wikipedia.org/wiki/Database_normalization


Very Advanced:

University of Texas:
(I quite like this whole site, since it has a handy menu on the
right
describing many important aspects of database normalization and
modeling.)
http://www.utexas.edu/its/windows/database/datamodeling/rm/rm7.html


Sincerely,

Chris O.
 
Dear Stone:

As others have said, your structure is the problem. Short of restructuring
the data, or as a step toward doing that, you can create a norlamized
appearance to this data with a "normalizing union query" and then you can
query the results of this normalizing query as though the data were
normalized in the first place. In addition, you can use a form of this
normalizing query to move the data into a properly constructed table
permanently.

For the purposes you show, you would need this:

SELECT TestID, TestCode, Area, CFU, 1 as Source, OrgID1
FROM YourTable
WHERE OrgID1 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 2 as Source, OrgID2
FROM YourTable
WHERE OrgID2 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 3 as Source, OrgID3
FROM YourTable
WHERE OrgID3 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 4 as Source, OrgID4
FROM YourTable
WHERE OrgID4 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 5 as Source, OrgID5
FROM YourTable
WHERE OrgID5 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 6 as Source, OrgID6
FROM YourTable
WHERE OrgID6 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 7 as Source, OrgID7
FROM YourTable
WHERE OrgID7 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 8 as Source, OrgID8
FROM YourTable
WHERE OrgID8 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 9 as Source, OrgID9
FROM YourTable
WHERE OrgID9 IS NOT NULL
UNION ALL
SELECT TestID, TestCode, Area, CFU, 10 as Source, OrgID10
FROM YourTable
WHERE OrgID10 IS NOT NULL

Here, I have added a column "Source" which keeps the original order to the
OrgIDs. This can be useful. It also makes the rows unique. If someone has
entered the same OrgID in OrgID4 and OrgID9, they will be unique. While
this may be an error, it is not necessarily the case you want this to cause
problems at this time.

From the above query, a Count() of each bacillus is simple:

SELECT OrgID, COUNT(*)
FROM TheQuery
GROUP BY OrgID

Note that, if an OrgID occurs twice in the same original row, it will be
counted twice.

Please let me know if this helped, and if I can be of any other assistance.

Tom Ellison
 
As in, "for your infomration, the table is not norlamized." LOL.

Not that's furmy. (sight gag)

Tom Ellison
 
Tom Ellison said:
I would have preferred I had spelled "norlamized" as "normalized" OK?

Tom Ellison

Tom Ellison,

That's alright, I spelled "information" as "infomration" in this
same thread. ;)


Sincerely,

Chris O.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Back
Top