Question Regarding Combining Duplicate Records

W

wwachsberger

I have a situation where I have quasi-duplicate data which looks like
the following:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300


There are about 10,000 examples of seperate duplicates like this within
my database. I have written a duplicate query in order to identify all
of these quasi-duplicate records. I now wish to combine all of the
duplicates in order to have the records look like the following:

Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600

I can then ignore the Revenues equal to 0. Many thanks for your help
in advance!

Warren
 
W

wwachsberger

Thank you for your response, however I tried that, but because I have
other transactions that are not duplicates, I don't know how to bring
the respective non-duplicates into the duplicate query so that I can
view all of my transactions together, by transaction. I am in extreme
gratitude for your help. Please let me know if this response is
confusing
 
C

chris.nebinger

I'm not understanding. Looking at the data:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


A query like:

Select [Name], Location, Color, Sum(Revenue) as TotalRevenue
From TableName
Group By [Name], Location, Color


Would result in:
Name Location Color Revenue
A New York Green 600
A New York Blue 750
A Pennslyvania Green 550


Isn't that what you want?


Chris Nebinger
 
W

wwachsberger

I agree, that does give the required result, and thank you for laying
it out. However, I would like to try and retain transaction level
detail, because I need to perform additional calculations later, on a
transactional basis. So ideally I would like to run a duplicates query
which I have created and then using your example:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would identify duplicates throughh my duplicate query

In (SELECT [Name] FROM [Query] As Tmp GROUP BY
[Name],[Location],[Color] HAVING Count(*)>1 And [Name] =
[Query].[Name] And [Color] = [Query].[Color] And [Location] =
[Query].[Location])

Giving me

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would now like a query so that my final output would be:


Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600
4 A New York Blue 750
5 A Pennslyvania Green 0
6 A Pennslyvania Green 550


I know this sounds convoluted, and I apologize for my confusing
responses, however for reasons too detailed to go into in this forum, I
need it by transaction. Thank you all so much for your help.

Regards,

Warren



I'm not understanding. Looking at the data:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


A query like:

Select [Name], Location, Color, Sum(Revenue) as TotalRevenue
From TableName
Group By [Name], Location, Color


Would result in:
Name Location Color Revenue
A New York Green 600
A New York Blue 750
A Pennslyvania Green 550


Isn't that what you want?


Chris Nebinger

Thank you for your response, however I tried that, but because I have
other transactions that are not duplicates, I don't know how to bring
the respective non-duplicates into the duplicate query so that I can
view all of my transactions together, by transaction. I am in extreme
gratitude for your help. Please let me know if this response is
confusing
 
J

John Nurick

From the sound of things you *don't* want to combine duplicate records.

Rather - and I can't think why - you need a query that returns all the
records but with "faked" values for Revenue: either the total revenue
for that combination of name, location and colour or zero, depending on
whether the record has the highest value of [Unique ID] for that
combination of name, location and colour. Something like this should do
it; I've included the two fields ActualRevenue and XX to make it easier
to see how it works.

SELECT A.[Unique ID], A.[Name], A.Location, A.Color,

A.Revenue AS ActualRevenue,

(SELECT COUNT(B.[Unique ID])
FROM MyTable AS B
WHERE (B.[Name]=A.[Name]) AND (B.Location=A.Location)
AND (B.Color=B.Color) AND (B.[Unique ID]>A.[Unique ID])
) AS XX,

IIF(
(SELECT COUNT(B.[Unique ID])
FROM MyTable AS B
WHERE (B.[Name]=A.[Name]) AND (B.Location=A.Location)
AND (B.Color=A.Color) AND (B.[Unique ID]>A.[Unique ID])
)=0,
(SELECT SUM(C.Revenue)
FROM MyTable AS C
WHERE (C.[Name]=A.[Name]) AND (C.Location=A.Location)
AND (C.Color=A.Color)
),
0
) AS Revenue

FROM MyTable AS A
ORDER BY A.[Unique ID], A.Revenue
;

I agree, that does give the required result, and thank you for laying
it out. However, I would like to try and retain transaction level
detail, because I need to perform additional calculations later, on a
transactional basis. So ideally I would like to run a duplicates query
which I have created and then using your example:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would identify duplicates throughh my duplicate query

In (SELECT [Name] FROM [Query] As Tmp GROUP BY
[Name],[Location],[Color] HAVING Count(*)>1 And [Name] =
[Query].[Name] And [Color] = [Query].[Color] And [Location] =
[Query].[Location])

Giving me

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would now like a query so that my final output would be:


Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600
4 A New York Blue 750
5 A Pennslyvania Green 0
6 A Pennslyvania Green 550


I know this sounds convoluted, and I apologize for my confusing
responses, however for reasons too detailed to go into in this forum, I
need it by transaction. Thank you all so much for your help.

Regards,

Warren



I'm not understanding. Looking at the data:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


A query like:

Select [Name], Location, Color, Sum(Revenue) as TotalRevenue
From TableName
Group By [Name], Location, Color


Would result in:
Name Location Color Revenue
A New York Green 600
A New York Blue 750
A Pennslyvania Green 550


Isn't that what you want?


Chris Nebinger

Thank you for your response, however I tried that, but because I have
other transactions that are not duplicates, I don't know how to bring
the respective non-duplicates into the duplicate query so that I can
view all of my transactions together, by transaction. I am in extreme
gratitude for your help. Please let me know if this response is
confusing



Klatuu wrote:
Waht you need is a Totals query.
Create a query in the query builder that contains the field you identify.
Click on the icon on the tool bar that looks like the Greek letter Sigma
(sort of).
You will see a row added to the query builder.
For all the rows execpt Revenue, select Group By. For Revenue, select Sum.
--
Dave Hargis, Microsoft Access MVP


:

I have a situation where I have quasi-duplicate data which looks like
the following:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300


There are about 10,000 examples of seperate duplicates like this within
my database. I have written a duplicate query in order to identify all
of these quasi-duplicate records. I now wish to combine all of the
duplicates in order to have the records look like the following:

Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600

I can then ignore the Revenues equal to 0. Many thanks for your help
in advance!

Warren
 
W

wwachsberger

Thank you very much for all of your help this seems to give the results
I need. I really appreciate all of your time.

Thanks again,

Warren
John said:
From the sound of things you *don't* want to combine duplicate records.

Rather - and I can't think why - you need a query that returns all the
records but with "faked" values for Revenue: either the total revenue
for that combination of name, location and colour or zero, depending on
whether the record has the highest value of [Unique ID] for that
combination of name, location and colour. Something like this should do
it; I've included the two fields ActualRevenue and XX to make it easier
to see how it works.

SELECT A.[Unique ID], A.[Name], A.Location, A.Color,

A.Revenue AS ActualRevenue,

(SELECT COUNT(B.[Unique ID])
FROM MyTable AS B
WHERE (B.[Name]=A.[Name]) AND (B.Location=A.Location)
AND (B.Color=B.Color) AND (B.[Unique ID]>A.[Unique ID])
) AS XX,

IIF(
(SELECT COUNT(B.[Unique ID])
FROM MyTable AS B
WHERE (B.[Name]=A.[Name]) AND (B.Location=A.Location)
AND (B.Color=A.Color) AND (B.[Unique ID]>A.[Unique ID])
)=0,
(SELECT SUM(C.Revenue)
FROM MyTable AS C
WHERE (C.[Name]=A.[Name]) AND (C.Location=A.Location)
AND (C.Color=A.Color)
),
0
) AS Revenue

FROM MyTable AS A
ORDER BY A.[Unique ID], A.Revenue
;

I agree, that does give the required result, and thank you for laying
it out. However, I would like to try and retain transaction level
detail, because I need to perform additional calculations later, on a
transactional basis. So ideally I would like to run a duplicates query
which I have created and then using your example:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would identify duplicates throughh my duplicate query

In (SELECT [Name] FROM [Query] As Tmp GROUP BY
[Name],[Location],[Color] HAVING Count(*)>1 And [Name] =
[Query].[Name] And [Color] = [Query].[Color] And [Location] =
[Query].[Location])

Giving me

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


I would now like a query so that my final output would be:


Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600
4 A New York Blue 750
5 A Pennslyvania Green 0
6 A Pennslyvania Green 550


I know this sounds convoluted, and I apologize for my confusing
responses, however for reasons too detailed to go into in this forum, I
need it by transaction. Thank you all so much for your help.

Regards,

Warren



I'm not understanding. Looking at the data:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300
4 A New York Blue 750
5 A Pennslyvania Green 250
6 A Pennslyvania Green 300


A query like:

Select [Name], Location, Color, Sum(Revenue) as TotalRevenue
From TableName
Group By [Name], Location, Color


Would result in:
Name Location Color Revenue
A New York Green 600
A New York Blue 750
A Pennslyvania Green 550


Isn't that what you want?


Chris Nebinger

(e-mail address removed) wrote:
Thank you for your response, however I tried that, but because I have
other transactions that are not duplicates, I don't know how to bring
the respective non-duplicates into the duplicate query so that I can
view all of my transactions together, by transaction. I am in extreme
gratitude for your help. Please let me know if this response is
confusing



Klatuu wrote:
Waht you need is a Totals query.
Create a query in the query builder that contains the field you identify.
Click on the icon on the tool bar that looks like the Greek letter Sigma
(sort of).
You will see a row added to the query builder.
For all the rows execpt Revenue, select Group By. For Revenue, select Sum.
--
Dave Hargis, Microsoft Access MVP


:

I have a situation where I have quasi-duplicate data which looks like
the following:

Unique ID Name Location Color Revenue
1 A New York Green 100
2 A New York Green 200
3 A New York Green 300


There are about 10,000 examples of seperate duplicates like this within
my database. I have written a duplicate query in order to identify all
of these quasi-duplicate records. I now wish to combine all of the
duplicates in order to have the records look like the following:

Unique ID Name Location Color Revenue
1 A New York Green 0
2 A New York Green 0
3 A New York Green 600

I can then ignore the Revenues equal to 0. Many thanks for your help
in advance!

Warren
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top