urgent ! Delete Dupe records and also Append Fields

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi I have table PROCESS with 250,000 records in it. TWO fields ID, BatchName
are in this table.

Problem 1: Some records get repeated for same ID, with same BatchName. In
this case, I would just like to remove the duplicates records.

Problem 2: Some records get repeated for same Id but with a different
batchName for that Id. In this case I would like remove all these extra
batchnames and create new fields for them and put all in one row.

What is the simplest way to to achieve this both.

THANKS IN ADVANCE
 
Hi,


1-

SELECT id, LAST(BatchName), LAST(description), LAST(dateTime)
FROM myTable
GROUP BY id


2-

Rank the names, first, then, use a crosstab query.

query1: (to get distinct names)

SELECT id, LAST(BatchName) as BName
FROM myTable
GROUP BY id


query2: (to rank the names)

SELECT a.id, a.BName, COUNT(*) As rank
FROM query1 As a INNER JOIN query1 As b
ON a.id=b.id AND a.BName <= b.BName
GROUP BY a.id, a.BName


finally:

TRANSFORM LAST(a.BatchName)
SELECT a.id
FROM myTable As a INNER JOIN query2 As b
ON a.id=b.id
GROUP BY a.id
PIVOT b.rank




Hoping it may help,
Vanderghast, Access MVP
 
Hi,


A GROUP BY query, or a TOTAL query aggregate (SUM, MAX, MIN, COUNT,
.... ) fields for the supplied groups. The first query thus produce ONE row
per id, and "aggregate" fields BatchName, Description and DateTime by taking
one record in the initial data that has the "id" of the group.


A crosstab query (TRANSFORM) creates new fields for each different
values in the expression supplied as PIVOT. In our cases, the created fields
are 1, 2, 3, 4, .... The GROUP BY supply the rows, and the grid is filled
with the aggregate LAST on BatchName (basically, meaning take "a" value
belonging to the same group, from a record having the same PIVOT result.

In general, with values:

f1 f2 f3
a 1 1010
a 2 2020
a 3 3030
b 1 0101
b 2 0202
b 4 0404


a crosstab group by f1, pivot on f2, aggregate with LAST(f3) would return:


f1 1 2 3 4
a 1010 2020 3030 null
b 0101 0202 null 0404



For a join, we can start with a Cartesian join which implies to merge all
records of one table with the one of another table. As example, if a tableA
has records with values a, b, and c while tableB has records with values 1,
2, 3, 4, then

SELECT f1, g1
FROM tableA, tableB


would produce

f1 g1
a 1
a 2
a 3
a 4
b 1
b 2
b 3
b 4
c 1
c 2
c 3
c 4



With an inner join, we only keep those results that the ON expression
evalutates to true.

SELECT b_1.g1, b_2.g1
FROM tableB As b_1 INNER JOIN tableB As b_2
ON b_1.g1 >= b_2.g1

so, if we start from the cartesian join, and evaluates the ON expression:

b_1.g1 b_2.g1 b_1.g1 >= b_2.g1
1 1 true
1 2 false
1 3 false
1 4 false
2 1 true
2 2 true
2 3 false
2 4 false
3 1 true
3 2 true
3 3 true
3 4 false
4 1 true
4 2 true
4 3 true
4 4 true



and we keep only the rows of the result where our expression evaluates to
true



An outer join will be like an inner join, except that if a row completly
dissapear of the result, the row is re-injected. With Jet, only one of the
table may see its row so "protected", it is the one that appears to the LEFT
or to the RIGHT of the word JOIN:


SELECT b_1.g1, b_2.g1
FROM tableB As b_1 LEFT JOIN tableB As b_2
ON b_1.g1 > b_2.g1



b_1.g1 b_2.g1 b_1.g1 >= b_2.g1

1 1 false
1 2 false
1 3 false
1 4 false
1 NULL re-injected

2 1 true
2 2 true
2 3 false
2 4 false

3 1 true
3 2 true
3 3 true
3 4 false

4 1 true
4 2 true
4 3 true
4 4 false


the end result is thus


b_1.g1 b_2.g1
1 NULL
2 1
3 1
3 2
4 1
4 2
4 3





Sure, that is a very short explanation. You may try to consult books
dedicated to SQL for much more detailled explanations.



Hoping it may help,
Vanderghast, Access MVP
 
Thanks for taking the trouble to explain. The ideas from your query did help
me in coming up with soultions, thanks
 
Back
Top