Hide Duplicates

G

Guest

Each day I must review a multi-office list of files identified by "Claim ID".
I append these daily lists into one table, then use a query to pull the
files by date range and a few other criteria which target files just for my
specific office. Due to internal procedures, some files from previous days
reappear on the daily list, thus giving me multiple occurrances of the same
Claim ID in the table. Consequently, I review them each time they occur.
Unfortunately, I must save the previous occurrances for record purposes, so
deleting them is not an option.
When I run the query for date "D", how can I have my query ignore previous
occurrances of Claim IDs for dates "A,B or C"? In effect, I want the query
to compare the filtered Claim IDs with those in the unfiltered table, then
only show me the Claim IDs which have only occurred once.
Hope that's clear enough.
 
S

Smartin

Rzig said:
Each day I must review a multi-office list of files identified by "Claim ID".
I append these daily lists into one table, then use a query to pull the
files by date range and a few other criteria which target files just for my
specific office. Due to internal procedures, some files from previous days
reappear on the daily list, thus giving me multiple occurrances of the same
Claim ID in the table. Consequently, I review them each time they occur.
Unfortunately, I must save the previous occurrances for record purposes, so
deleting them is not an option.
When I run the query for date "D", how can I have my query ignore previous
occurrances of Claim IDs for dates "A,B or C"? In effect, I want the query
to compare the filtered Claim IDs with those in the unfiltered table, then
only show me the Claim IDs which have only occurred once.
Hope that's clear enough.

Hi Rzig,

I'm not totally clear on what you mean by "filtered" and "unfiltered",
but will SELECT DISTINCT do what you want? If not, perhaps an unmatched
query, to only show the Claim IDs in the current date that are not in
previous dates?
 
G

Guest

Sorry for the confusing description. Where would I apply Select Distinct?
I don't think find unmatched will work, since it seems intended for
comparison of tables or queries. I have one table.
Maybe I designed the Db wrong. Let me try once more to explain.
On 2/1, I get a list of claims to review. I import them to TblIntake and
apply my query to separate the claims for my office.
On 2/2, I get another list of claims to review. I append that list to
TblIntake and apply my query, using a date range to pull only 2/2 claims.
However, some of the claims from 2/1 could be repeated on the 2/2 list. So
I want my query to ignore any [Claim Id] which occurs on both 2/1 and 2/2.

I think I solved it by adapting a piece of code from a Find Duplicates query
which applied the string. In (SELECT [Claim Id] FROM [TblIntake] As Tmp
GROUP BY [Claim Id] HAVING Count(*)>1 ). I changed the > to =. The query is
slow, but seems to work.
If you still have a suggestion, I'm all ears.
Thanks for the time, all you folks are great.
 
S

Smartin

Rzig said:
Sorry for the confusing description. Where would I apply Select Distinct?
I don't think find unmatched will work, since it seems intended for
comparison of tables or queries. I have one table.
Maybe I designed the Db wrong. Let me try once more to explain.
On 2/1, I get a list of claims to review. I import them to TblIntake and
apply my query to separate the claims for my office.
On 2/2, I get another list of claims to review. I append that list to
TblIntake and apply my query, using a date range to pull only 2/2 claims.
However, some of the claims from 2/1 could be repeated on the 2/2 list. So
I want my query to ignore any [Claim Id] which occurs on both 2/1 and 2/2.

I think I solved it by adapting a piece of code from a Find Duplicates query
which applied the string. In (SELECT [Claim Id] FROM [TblIntake] As Tmp
GROUP BY [Claim Id] HAVING Count(*)>1 ). I changed the > to =. The query is
slow, but seems to work.
If you still have a suggestion, I'm all ears.
Thanks for the time, all you folks are great.

I guess it depends on what you want to do. Your example using HAVING
above will show unique Claim Id's. That is, it will show Claim Id's that
appear exactly once. Adding an index on Claim Id might help the speed issue.

A concern I have with that approach is suppose you are out of the office
on 2/3. On 2/4 you return and load up 2/3 and 2/4. If a claim was "new"
on 2/3 but repeats on the 2/4 list, you will never see it using this query.

Assuming you are keeping a "file date" indicator in your big table one
way around this is to add a WHERE clause that limits the candidates to
two successive days. Something like

WHERE FileDate = [Enter the file date to examine]
OR FileDate = DateAdd ("d", -1, [Enter the file date to examine])

This may need to be modified if for example you don't receive files on
certain days, e.g., weekends.

Coincidentally I am currently working on a very similar project. I need
to compare monthly batches of ~25K claim records against each other. We
expect many to be carried forward, while some claims are new, some
disappear, and some are changed in a significant way. I need to identify
all of these.

My approach has been to write several small queries that identify, for
example, "what is current?" and "what was previously current?" and then
create new queries against the small queries such as matched (to see the
carry forward) and unmatched (what is new, or what disappeared,
depending on which direction the join goes).

Hope this gives you some ideas.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top