UPDATE with INNER JOIN

  • Thread starter Thread starter David Rigler
  • Start date Start date
D

David Rigler

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU
to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I would
look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be used
since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem with
matches if Time is a datetime field. Since time is stored as the decimal
portion of a double number (floating point number) which can introduce some
small discrepancies in the accuracy of the calculated value.
 
Sheesh, thanks, I didnt realise that '+' was having such a terrible effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM temp1)


dave
John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I would
look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be used
since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem with
matches if Time is a datetime field. Since time is stored as the decimal
portion of a double number (floating point number) which can introduce some
small discrepancies in the accuracy of the calculated value.


David Rigler said:
UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU to
go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
No, I would not. I would suggest what I suggested earlier. If that is not
working, then post back with your problem - an error message (what is it),
wrong data, etc.

By the way NOT IN is known to be slow.


David Rigler said:
Sheesh, thanks, I didnt realise that '+' was having such a terrible
effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM temp1)


dave
John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I would
look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be
used since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem
with matches if Time is a datetime field. Since time is stored as the
decimal portion of a double number (floating point number) which can
introduce some small discrepancies in the accuracy of the calculated
value.


David Rigler said:
UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU
to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
Not quite sure i understand you there, when you say "I would suggest
what I suggested earlier." ?

I'm merging one table into another by UPDATING those records that
allready exist and INSERTING those that dont. If the '+' was bad for the
UPDATE should i change the INSERT

dave

John Spencer wrote:
No, I would not. I would suggest what I suggested earlier. If that is not
working, then post back with your problem - an error message (what is it),
wrong data, etc.

By the way NOT IN is known to be slow.


David Rigler said:
Sheesh, thanks, I didnt realise that '+' was having such a terrible
effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM temp1)


dave
John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I would
look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be
used since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem
with matches if Time is a datetime field. Since time is stored as the
decimal portion of a double number (floating point number) which can
introduce some small discrepancies in the accuracy of the calculated
value.


UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU
to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
Normally, you only update or insert in a query. You don't do both.

Access can do both at once with an outer join. (So I've read.) I have not
ever done that and probably won't.

I would use two queries.

Update existing records:
UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Insert New Records:
Insert into Temp1([Date],[Time],Data2)
SELECT Data2.[Date],Data2.[Time],Data2.Data2
FROM Data2 LEFT JOIN Temp
ON Data2.[Date] = Temp1.[Date] AND
Data2.[Time] = Temp1.[Time]
WHERE Temp1.Date is Null and Temp1.Time is Null

David Rigler said:
Not quite sure i understand you there, when you say "I would suggest what
I suggested earlier." ?

I'm merging one table into another by UPDATING those records that allready
exist and INSERTING those that dont. If the '+' was bad for the UPDATE
should i change the INSERT

dave

John Spencer wrote:
No, I would not. I would suggest what I suggested earlier. If that is
not working, then post back with your problem - an error message (what is
it), wrong data, etc.

By the way NOT IN is known to be slow.


David Rigler said:
Sheesh, thanks, I didnt realise that '+' was having such a terrible
effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM temp1)


dave

John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I
would look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be
used since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem
with matches if Time is a datetime field. Since time is stored as the
decimal portion of a double number (floating point number) which can
introduce some small discrepancies in the accuracy of the calculated
value.


UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU
to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
Thanks, thats been an excellent SQL lesson

dave
John Spencer wrote:
Normally, you only update or insert in a query. You don't do both.

Access can do both at once with an outer join. (So I've read.) I have not
ever done that and probably won't.

I would use two queries.

Update existing records:
UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Insert New Records:
Insert into Temp1([Date],[Time],Data2)
SELECT Data2.[Date],Data2.[Time],Data2.Data2
FROM Data2 LEFT JOIN Temp
ON Data2.[Date] = Temp1.[Date] AND
Data2.[Time] = Temp1.[Time]
WHERE Temp1.Date is Null and Temp1.Time is Null

David Rigler said:
Not quite sure i understand you there, when you say "I would suggest what
I suggested earlier." ?

I'm merging one table into another by UPDATING those records that allready
exist and INSERTING those that dont. If the '+' was bad for the UPDATE
should i change the INSERT

dave

John Spencer wrote:
No, I would not. I would suggest what I suggested earlier. If that is
not working, then post back with your problem - an error message (what is
it), wrong data, etc.

By the way NOT IN is known to be slow.


Sheesh, thanks, I didnt realise that '+' was having such a terrible
effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM temp1)


dave

John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I
would look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can be
used since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem
with matches if Time is a datetime field. Since time is stored as the
decimal portion of a double number (floating point number) which can
introduce some small discrepancies in the accuracy of the calculated
value.


UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the CPU
to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So what
would be the better way

dave
 
Hi,



To insert AND update, in one query, just for info, is:

UPDATE temp1 RIGHT JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2,
temp1.[Date] = Data2.[Date],
temp1.[Time] = Data2.[Time]


which is like the first query except that instead of INNER, you use a RIGHT
join (to keep all the updating data), and update the fields in the ON
clause, in the SET clause.


Someone can DEFINE an update like this: Take a look at

SELECT temp1.Data2 , Data2.Data2,
temp1.[Date] , Data2.[Date],
temp1.[Time], Data2.[Time]

FROM temp1 RIGHT JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]

where the SELECT takes, and lists, the values involved in the SET clause of
the update.

After the update, you want see, with the same SELECT, the columns but,
now,where

temp1.Data2 will be = Data2.Data2,
temp1.[Date] will be = Data2.[Date],
temp1.[Time] will be = Data2.[Time]


which is the SET clause, without the "will be".


Hoping it may help,
Vanderghast, Access MVP


David Rigler said:
Thanks, thats been an excellent SQL lesson

dave
John Spencer wrote:
Normally, you only update or insert in a query. You don't do both.

Access can do both at once with an outer join. (So I've read.) I have
not ever done that and probably won't.

I would use two queries.

Update existing records:
UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Insert New Records:
Insert into Temp1([Date],[Time],Data2)
SELECT Data2.[Date],Data2.[Time],Data2.Data2
FROM Data2 LEFT JOIN Temp
ON Data2.[Date] = Temp1.[Date] AND
Data2.[Time] = Temp1.[Time]
WHERE Temp1.Date is Null and Temp1.Time is Null

David Rigler said:
Not quite sure i understand you there, when you say "I would suggest
what I suggested earlier." ?

I'm merging one table into another by UPDATING those records that
allready exist and INSERTING those that dont. If the '+' was bad for the
UPDATE should i change the INSERT

dave


John Spencer wrote:
No, I would not. I would suggest what I suggested earlier. If that is
not working, then post back with your problem - an error message (what
is it), wrong data, etc.

By the way NOT IN is known to be slow.


Sheesh, thanks, I didnt realise that '+' was having such a terrible
effect.

In the light of that would you suggest a change to

INSERT INTO temp1 ([Date],[Time],Data1,Data2,Data3)
SELECT [Date],[Time], NULL, Data2, NULL
FROM Data2
WHERE Data2.[Date]+Data2.[Time] NOT IN (SELECT [Date]+[Time] FROM
temp1)


dave

John Spencer wrote:
I would try the following after I BACKUP the data.

UPDATE temp1 INNER JOIN Data2
ON temp1.[Date] = Data2.[Date] AND
temp1.[Time] = Data2.[Time]
SET temp1.Data2 = Data2.Data2

Also, if you don't have an index on the fields Date and Time then I
would look at adding the indexes if performance is still slow.

Your method of adding the time to the date means that no indexes can
be used since your are building a calculated field in the join.

Note: While this is not probable, you may still have had some problem
with matches if Time is a datetime field. Since time is stored as
the decimal portion of a double number (floating point number) which
can introduce some small discrepancies in the accuracy of the
calculated value.


UPDATE temp1 INNER JOIN Data2
ON temp1.[Date]+temp1.[Time] = Data2.[Date]+Data2.[Time]
SET temp1.Data2 = Data2.Data2

running that against tables with approx 500,000 records causes the
CPU to go to 100% and it ran overnight before i killed it

although it doesnt produce a syntax error there is a problem. So
what would be the better way

dave
 
Back
Top