urgent ! Delete Dupe records and also Append Fields

Guest · Feb 15, 2005

Hi I have table PROCESS with 250,000 records in it. TWO fields ID, BatchName
are in this table.

Problem 1: Some records get repeated for same ID, with same BatchName. In
this case, I would just like to remove the duplicates records.

Problem 2: Some records get repeated for same Id but with a different
batchName for that Id. In this case I would like remove all these extra
batchnames and create new fields for them and put all in one row.

What is the simplest way to to achieve this both.

THANKS IN ADVANCE

Michel Walsh · Feb 15, 2005

Hi,

1-

SELECT id, LAST(BatchName), LAST(description), LAST(dateTime)
FROM myTable
GROUP BY id

2-

Rank the names, first, then, use a crosstab query.

query1: (to get distinct names)

SELECT id, LAST(BatchName) as BName
FROM myTable
GROUP BY id

query2: (to rank the names)

SELECT a.id, a.BName, COUNT(*) As rank
FROM query1 As a INNER JOIN query1 As b
ON a.id=b.id AND a.BName <= b.BName
GROUP BY a.id, a.BName

finally:

TRANSFORM LAST(a.BatchName)
SELECT a.id
FROM myTable As a INNER JOIN query2 As b
ON a.id=b.id
GROUP BY a.id
PIVOT b.rank

Hoping it may help,
Vanderghast, Access MVP

Michel Walsh · Feb 16, 2005

Hi,

A GROUP BY query, or a TOTAL query aggregate (SUM, MAX, MIN, COUNT,
.... ) fields for the supplied groups. The first query thus produce ONE row
per id, and "aggregate" fields BatchName, Description and DateTime by taking
one record in the initial data that has the "id" of the group.

A crosstab query (TRANSFORM) creates new fields for each different
values in the expression supplied as PIVOT. In our cases, the created fields
are 1, 2, 3, 4, .... The GROUP BY supply the rows, and the grid is filled
with the aggregate LAST on BatchName (basically, meaning take "a" value
belonging to the same group, from a record having the same PIVOT result.

In general, with values:

f1 f2 f3
a 1 1010
a 2 2020
a 3 3030
b 1 0101
b 2 0202
b 4 0404

a crosstab group by f1, pivot on f2, aggregate with LAST(f3) would return:

f1 1 2 3 4
a 1010 2020 3030 null
b 0101 0202 null 0404

For a join, we can start with a Cartesian join which implies to merge all
records of one table with the one of another table. As example, if a tableA
has records with values a, b, and c while tableB has records with values 1,
2, 3, 4, then

SELECT f1, g1
FROM tableA, tableB

would produce

f1 g1
a 1
a 2
a 3
a 4
b 1
b 2
b 3
b 4
c 1
c 2
c 3
c 4

With an inner join, we only keep those results that the ON expression
evalutates to true.

SELECT b_1.g1, b_2.g1
FROM tableB As b_1 INNER JOIN tableB As b_2
ON b_1.g1 >= b_2.g1

so, if we start from the cartesian join, and evaluates the ON expression:

b_1.g1 b_2.g1 b_1.g1 >= b_2.g1
1 1 true
1 2 false
1 3 false
1 4 false
2 1 true
2 2 true
2 3 false
2 4 false
3 1 true
3 2 true
3 3 true
3 4 false
4 1 true
4 2 true
4 3 true
4 4 true

and we keep only the rows of the result where our expression evaluates to
true

An outer join will be like an inner join, except that if a row completly
dissapear of the result, the row is re-injected. With Jet, only one of the
table may see its row so "protected", it is the one that appears to the LEFT
or to the RIGHT of the word JOIN:

SELECT b_1.g1, b_2.g1
FROM tableB As b_1 LEFT JOIN tableB As b_2
ON b_1.g1 > b_2.g1

b_1.g1 b_2.g1 b_1.g1 >= b_2.g1

1 1 false
1 2 false
1 3 false
1 4 false
1 NULL re-injected

2 1 true
2 2 true
2 3 false
2 4 false

3 1 true
3 2 true
3 3 true
3 4 false

4 1 true
4 2 true
4 3 true
4 4 false

the end result is thus

b_1.g1 b_2.g1
1 NULL
2 1
3 1
3 2
4 1
4 2
4 3

Sure, that is a very short explanation. You may try to consult books
dedicated to SQL for much more detailled explanations.

Hoping it may help,
Vanderghast, Access MVP

Guest · Feb 25, 2005

Thanks for taking the trouble to explain. The ideas from your query did help
me in coming up with soultions, thanks

urgent ! Delete Dupe records and also Append Fields

Guest

Michel Walsh

Michel Walsh

Guest