Inner join on table with NULL's

C

Carl Colijn

Hi group,

I want to perform an inner join between two tables.

The first (named 'Data') has some data columns and two classification
columns ('Period' = Date/Time, and 'SequenceNr' = Long) that identify the
type of record. If the record is actual, then the classification columns
contains NULL's. If the record is historical, then the record contains the
correct date for the period and a sequence number (there can be more than
one set of historical records per period).

The second table (named 'Conf_History') is used as a configuration table to
determine the period and sequence nr. to show throughout the database's
forms. It thus contains the same two columns 'Period' and 'SequenceNr'. It
holds exactly one record.

Now comes the tricky part. I want to base my forms on a RWOP query to
access the data in the backend database, but they should only show the
correct set of records (according to the 'Conf_History' table). My first
try was to use an inner join in the query:
SELECT Data.*
INNER JOIN Conf_History
ON Data.Period = Conf_History.Period
AND Data.SequenceNr = Conf_History.SequenceNr
but this didn't work because NULL = NULL doesn't work. The next attempt
was:
SELECT Data.*
FROM Data, Conf_History
WHERE Data.Period = Conf_History.Period
AND Data.SequenceNr = Conf_Historiy.SequenceNr
OR (Data.Period IS NULL AND Conf_Historiy.Period IS NULL)
but this seems inefficient (making a cross join first combined with a more
convoluted WHERE clause). And then the result set of these queries is not
appendable nor modifyable (a requirement for my forms).

The alternative I came up with is to use a custom VBA function to check each
record:
SELECT *
FROM Data
WHERE IsCorrectPeriod(Period, SequenceNr)
in combination with the VBA routine:
Private g_dPeriod As Date ' These get initialized on DB
startup
Private g_nSequenceNr As Long ' These get initialized on DB startup
Private g_bIsCurrent As Boolean ' These get initialized on DB startup
Public Function IsCorrectPeriod(dPeriod As Variant, nSequenceNr As
Variant)
If IsNull(dPeriod) And g_bIsCurrent Then
IsCorrectPeriod = True
Else
IsCorrectPeriod = (dPeriod = g_dPeriod) And _
(g_nSequenceNr = nSequenceNr)
End If
End Function
but I can imagine this gives even more overhead... The resultset of the
query is appendable and modifyable however, exacty as I wanted.

And now for the $1.000.000 question: is there an alternative that given me
better performance and/or less complicated queries?

Thanks in advance,
Carl Colijn
 
G

Guest

I dont know if this might help at all but i had to do something similar to
that, and i used LEFT join and used for criteria: NULL. for me this returned
all the results with NULL in that column, and none of the results with
anything in that column. like i said this worked for me and i am not quite
sure if it will work for you....
 
C

Carl Colijn

JKarchner said:
I dont know if this might help at all but i had to do something
similar to that, and i used LEFT join and used for criteria: NULL.
for me this returned all the results with NULL in that column, and
none of the results with anything in that column. like i said this
worked for me and i am not quite sure if it will work for you....

Hi JKarchner,

Unfortunately this solution doesn't work in my situation. In my case I do
not always want to select the records with a NULL in them, but conditionally
select either the NULL-containing records _or_ the records containing a
certain value. The condition is stored in another record in another table,
which holds the correct value to check for (NULL or a real value).
Therefore my first thought was to use a JOIN to select the records with the
right value in them, either NULL or that certain value.

The problem is that Access (and I believe all SQL implementations) cannot
deal with NULL values at all when using a JOIN, so that "table1.column =
table2.column" cannot be used as a join condition if they both contain NULL
(NULL <> NULL in SQL, by design I believe).

Anyway, thanks for the suggestion,
Kind regards,
Carl Colijn
 
K

Ken Snell \(MVP\)

The problem is that Access (and I believe all SQL implementations) cannot
deal with NULL values at all when using a JOIN, so that "table1.column =
table2.column" cannot be used as a join condition if they both contain
NULL (NULL <> NULL in SQL, by design I believe).

Anyway, thanks for the suggestion,
Kind regards,
Carl Colijn


The "trick" in this case is to use "nonequijoin" utilizing the Nz function
to replace a Null value with a "unique" value that will tell you the value
is Null. For example, suppose that the field contains only positive
number -- you could use a query like this:

SELECT T1.*, T2.*
FROM Table1 AS T1
INNER JOIN Table2 AS T2
WHERE Nz(T1.FieldName, -100) =
Nz(T2.FieldName, -100);

Similar steps could be used for text:

SELECT T1.*, T2.*
FROM Table1 AS T1
INNER JOIN Table2 AS T2
WHERE Nz(T1.FieldName, "MyNullValue") =
Nz(T2.FieldName, "MyNullValue");
 
C

Carl Colijn

Ken said:
The "trick" in this case is to use "nonequijoin" utilizing the Nz
function to replace a Null value with a "unique" value that will tell
you the value is Null. For example, suppose that the field contains
only positive number -- you could use a query like this:

SELECT T1.*, T2.*
FROM Table1 AS T1
INNER JOIN Table2 AS T2
WHERE Nz(T1.FieldName, -100) =
Nz(T2.FieldName, -100);

Similar steps could be used for text:

SELECT T1.*, T2.*
FROM Table1 AS T1
INNER JOIN Table2 AS T2
WHERE Nz(T1.FieldName, "MyNullValue") =
Nz(T2.FieldName, "MyNullValue");

Hi Ken,

I thought about that too, substituting something like #1/1/1600# for the
'Period' value if it's empty (#1/1/1600# being a 'safe' value in my case).
The problem is that the result of a SELECT query based on an INNER JOIN
isn't updateable and/or appendable, which makes it unuseable in my case...
Therefore I guess all JOINs are out of the window then...

However, when I supply the record condition via a VBA function (that
compares the record's values to the values in the other table via code), the
resulting recordset _is_ updateable. It does mean, however, that the VBA
function gets called for each and every record in my 'left' table to be
joined. Does calling a VBA function this way produce much (runtime)
overhead? The lookup time for the value to compare to can in my case be
reduced by properly caching that value in the VBA function itself, but the
VBA calls remain.

I also asked because I hoped there would be an easier/more manageable
(SQL-only) solution.

Anyway, thanks for the answer,
Carl Colijn
 
K

Ken Snell \(MVP\)

Calling a user-defined function from a query will definitely slow down the
query -- no way to escape that.

INNER JOIN queries usually are updatable -- if yours is not, likely it's
because the query is not selecting the primary key fields from the tables to
be updated, or the query cannot identify a unique record to be updated. The
join in and of itself is probably not the issue.

Post your entire SQL statement and perhaps we can point out how to change it
to make it updatable.
 
C

Carl Colijn

Ken said:
Calling a user-defined function from a query will definitely slow
down the query -- no way to escape that.

INNER JOIN queries usually are updatable -- if yours is not, likely
it's because the query is not selecting the primary key fields from
the tables to be updated, or the query cannot identify a unique
record to be updated. The join in and of itself is probably not the
issue.
Post your entire SQL statement and perhaps we can point out how to
change it to make it updatable.

Hi Ken,

The tables are indexed on all sorts of columns, but they do not have primary
keys defined.

A part of the real-world example follows:
Table Conf_Personnel:
PersonnelID: text (indexed, duplicates OK)
Name: text (indexed, duplicates OK)
ChefID: text
LastModified: date
Period: date (indexed, duplicates OK)
SequenceNr: long
Table Conf_History:
Period: date
SequenceNr: long
Query Retrieve_Correct_Personnel:
SELECT Conf_Personnel.*
FROM Conf_Personnel
INNER JOIN Conf_History
ON Conf_Personnel.Period = Conf_History.Period
AND Conf_Personnel.SequenceNr = Conf_History.SequenceNr
WITH OWNERACCESS OPTION;

The tables are located in a backend database, and the query is located in a
frontend database. The query is a RWOP query; the current user has
restricted rights on the backend database. I'm using Access 2003, but this
application needs to be able to run under Access 2000 as well. The
resulting dataset is not updateable nor expandable.

Table Conf_Personnel contains a set of personnel records. The database
contains also a lot of data tables that reference the personnel table. Each
time the user enters data, it is added to the 'current' data set, and thus
uses the 'current' personnel records. From time to time the database
administrator will close the current time period and the database needs to
flag all the 'current' data as being 'historical' (using the period and
sequence nr given by the admin). But the set of personnel records then also
needs to be duplicated and the copied set will be flagged with the same
period and sequence nr. flags as used for the data. This way the personnel
records used for a certain time period will be retained. From time to time
these historical datasets (real data and configuration data) will be
exported to separate historical databases for safekeeping. So all the data
in the datasets in the historical databases contain only one certain
Period/SequenceNr flag, but the 'current' database may contain different
sets, each having a different Period/SequenceNr flag.
Table Conf_History contains exactly one record. This record either contains
a valid Period/SequenceNr combination (a real date and nr.) to denote the
database contains historical information from that period, or two NULL's to
denote the database contains actual data.

Because the records in the Personnel table need to be duplicated (but with a
Period/SequenceNr stamp in stead of NULL's), none of the columns in the
Personnel table can be marked as the primary key.

When, for testing purposes, I modify the query to e.g.
SELECT *
FROM Conf_Personnel
WHERE Period IS NULL
AND SequenceNr IS NULL
WITH OWNERACCESS OPTION;
the resulting recordset is updateable and expandable. Same goes when I let
a VBA function perform the match checking against the sole record in the
Conf_History table;
SELECT *
FROM Conf_Personnel
WHERE IsCorrectPeriod(Period, SequenceNr)
WITH OWNERACCESS OPTION;

But I believe your comment about needing to select the primary key put me on
the right track. When I loosened the JOIN to only check for the Period
column and also marked the 'Period' column of the 'Conf_History' table to be
the primary key for that table, the query returned an updateable and
expandable record set! When I added the other join condition (on
SequenceNr) again, the party was over however, and the recordset was
read-only again...

Any thoughts?

For now I have a solution working by using the IsCorrectPeriod VBA function
check out each record for inclusion from within the query. Since the
configuration data is not that big of a dataset (500 records max), it is a
feasable solution (I didn't notice any performance effect at least). But
this has become a bit of an academic question for me now: can it be done
without resorting to VBA? I'd rather strip out the VBA function and have
everything done in SQL, even if only for maintainance purposes and not
runtime performance.

Thanks for your effort,
Kind regards,
Carl Colijn
 
C

Carl Colijn

Carl said:
Hi Ken,

The tables are indexed on all sorts of columns, but they do not have
primary keys defined.

A part of the real-world example follows:
Table Conf_Personnel:
PersonnelID: text (indexed, duplicates OK)
Name: text (indexed, duplicates OK)
ChefID: text
LastModified: date
Period: date (indexed, duplicates OK)
SequenceNr: long
Table Conf_History:
Period: date
SequenceNr: long
Query Retrieve_Correct_Personnel:
SELECT Conf_Personnel.*
FROM Conf_Personnel
INNER JOIN Conf_History
ON Conf_Personnel.Period = Conf_History.Period
AND Conf_Personnel.SequenceNr = Conf_History.SequenceNr
WITH OWNERACCESS OPTION;

The tables are located in a backend database, and the query is
located in a frontend database. The query is a RWOP query; the
current user has restricted rights on the backend database. I'm
using Access 2003, but this application needs to be able to run under
Access 2000 as well. The resulting dataset is not updateable nor
expandable.
Table Conf_Personnel contains a set of personnel records. The
database contains also a lot of data tables that reference the
personnel table. Each time the user enters data, it is added to the
'current' data set, and thus uses the 'current' personnel records. From
time to time the database administrator will close the current
time period and the database needs to flag all the 'current' data as
being 'historical' (using the period and sequence nr given by the
admin). But the set of personnel records then also needs to be
duplicated and the copied set will be flagged with the same period
and sequence nr. flags as used for the data. This way the personnel
records used for a certain time period will be retained. From time
to time these historical datasets (real data and configuration data)
will be exported to separate historical databases for safekeeping. So all
the data in the datasets in the historical databases contain
only one certain Period/SequenceNr flag, but the 'current' database
may contain different sets, each having a different Period/SequenceNr
flag. Table Conf_History contains exactly one record. This record either
contains a valid Period/SequenceNr combination (a real date and nr.)
to denote the database contains historical information from that
period, or two NULL's to denote the database contains actual data.

Because the records in the Personnel table need to be duplicated (but
with a Period/SequenceNr stamp in stead of NULL's), none of the
columns in the Personnel table can be marked as the primary key.

When, for testing purposes, I modify the query to e.g.
SELECT *
FROM Conf_Personnel
WHERE Period IS NULL
AND SequenceNr IS NULL
WITH OWNERACCESS OPTION;
the resulting recordset is updateable and expandable. Same goes when
I let a VBA function perform the match checking against the sole
record in the Conf_History table;
SELECT *
FROM Conf_Personnel
WHERE IsCorrectPeriod(Period, SequenceNr)
WITH OWNERACCESS OPTION;

But I believe your comment about needing to select the primary key
put me on the right track. When I loosened the JOIN to only check
for the Period column and also marked the 'Period' column of the
'Conf_History' table to be the primary key for that table, the query
returned an updateable and expandable record set! When I added the
other join condition (on SequenceNr) again, the party was over
however, and the recordset was read-only again...

Any thoughts?

For now I have a solution working by using the IsCorrectPeriod VBA
function check out each record for inclusion from within the query. Since
the configuration data is not that big of a dataset (500
records max), it is a feasable solution (I didn't notice any
performance effect at least). But this has become a bit of an
academic question for me now: can it be done without resorting to
VBA? I'd rather strip out the VBA function and have everything done
in SQL, even if only for maintainance purposes and not runtime
performance.
Thanks for your effort,
Kind regards,
Carl Colijn

I already worked around the problem in another way: in stead of using two
columns to indicate both Period (DateTime) and SequenceNr (Long), I now use
a single column PeriodID (Long). In this column I can easily encode both
the Period and SequenceNr without running out of values (32 bits is more
than enough for year (0-65535), month (0-15) and sequencenr (0-4095) ). I
also then made it a policy not to use NULL for the 'current' period, but
just use '0' instead.

This way I end up with only one column to INNER JOIN on in stead of two, and
by also making the (now only) column PeriodID in the 'Conf_History' table
the primary key for that table, the JOINs now return an updateable and
appendable recordset. Thus I can scrap the VBA function as well. Quite an
operation, but the result is worth-while (using NULL's was a bit of a design
flaw from the past that now also gets corrected in the process). Decoding
between the PeriodID and the Period/SequenceNr data it holds only has to be
done sporadically; most of the time the queries are just comparing values.

But to re-phrase the essence of the main question: can you INNER JOIN two
tables on two columns, and still end up with an updateable recordset?
Assigning primary keys to the tables is essential here, but of course you
cannot assign two primary keys to the 'right' table in the join for both
JOINed columns...

Anyway, thanks for the effort!
Kind regards,
Carl Colijn
 
K

Ken Snell \(MVP\)

< g > Not ignoring you, just extremely busy at the moment and unable to
provide a good reply... will be back as soon as possible.

--

Ken Snell
<MS ACCESS MVP>

< snipped >
 
K

Ken Snell \(MVP\)

The answer to your question -- "can you INNER JOIN two tables on two
columns, and still end up with an updateable recordset" -- is, "yes" or "no"
or "it depends". As you've discovered, the key can be in the table
structure, and it can be in the query structure. It's certainly possible to
do it, but it depends upon which table (main or child) is to be updated, and
whether unique record within that table(s) can be identified.

When a query is nonupdatable, it means that Jet cannot identify a unique
identifier for each record selected by the query. This can be overcome by
combining existing fields into a calculated field (which you did), by
eliminating some of the fields that give Jet the "indigestion" for "what is
the uniqueness" from the query, or by redoing your table structure.

I don't believe I can give you a firm, definitive answer; often, it's a
matter of test/trial/error. But here are a number of Microsoft articles
about this behavior (watch for line-wrapping):

When can I update data from a query?
http://msdn.microsoft.com/library/d...l/acconDeterminingWhenCanUpdateDataQueryS.asp

Harnessing the Power of Updatable Queries
http://msdn.microsoft.com/library/default.asp?url=/archive/en-us/dnaraccgen/html/msdn_harness.asp

ACC2000: Cannot Edit or Update Record on a Form
http://support.microsoft.com/default.aspx?scid=kb;en-us;209571&Product=acc

ACC2000: Status Bar Displays "Recordset Not Updateable" Message When You Try
to Update a Linked Table
http://support.microsoft.com/default.aspx?scid=kb;en-us;304179&Product=acc

INFO: Troubleshooting Errors That May Occur When You Update Data in Queries
and in Forms
http://support.microsoft.com/default.aspx?scid=kb;en-us;328828&Product=acc

ACC: Update Query Based on Totals Query Fails
http://support.microsoft.com/default.aspx?scid=kb;en-us;116142&Product=acc


--

Ken Snell
<MS ACCESS MVP>
 
C

Carl Colijn

Ken said:
The answer to your question -- "can you INNER JOIN two tables on two
columns, and still end up with an updateable recordset" -- is, "yes"
or "no" or "it depends". As you've discovered, the key can be in the
table structure, and it can be in the query structure. It's certainly
possible to do it, but it depends upon which table (main or child) is
to be updated, and whether unique record within that table(s) can be
identified.
When a query is nonupdatable, it means that Jet cannot identify a
unique identifier for each record selected by the query. This can be
overcome by combining existing fields into a calculated field (which
you did), by eliminating some of the fields that give Jet the
"indigestion" for "what is the uniqueness" from the query, or by
redoing your table structure.
I don't believe I can give you a firm, definitive answer; often, it's
a matter of test/trial/error. But here are a number of Microsoft
articles about this behavior (watch for line-wrapping):

When can I update data from a query?
http://msdn.microsoft.com/library/d...l/acconDeterminingWhenCanUpdateDataQueryS.asp

Harnessing the Power of Updatable Queries
http://msdn.microsoft.com/library/default.asp?url=/archive/en-us/dnaraccgen/html/msdn_harness.asp

ACC2000: Cannot Edit or Update Record on a Form
http://support.microsoft.com/default.aspx?scid=kb;en-us;209571&Product=acc

ACC2000: Status Bar Displays "Recordset Not Updateable" Message When
You Try to Update a Linked Table
http://support.microsoft.com/default.aspx?scid=kb;en-us;304179&Product=acc

INFO: Troubleshooting Errors That May Occur When You Update Data in
Queries and in Forms
http://support.microsoft.com/default.aspx?scid=kb;en-us;328828&Product=acc

ACC: Update Query Based on Totals Query Fails
http://support.microsoft.com/default.aspx?scid=kb;en-us;116142&Product=acc

Hi Ken,

Thanks for the effort! My accute problem has indeed been solved already by
combining the two columns to join on in one single column, but they're nice
links to bookmark for further reading to gain a bit more insight in what's
going on.

Have a nice weekend,
Kind regards,
Carl Colijn
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top