Delete duplicates based on time

Z

Zuzeppeddu

Hi
I have a spreadsheet with two columns. Col1 contains duplicate
records. These duplicates can be two or three or more. Col1 only
contains duplicates i.e. it doesn't have any unique records as I have
filtered them out.

Col2 contains date and time. I want to write a script which would
enable me to delete those duplicates where time difference is less
than or equal to 15 minutes. If it is a pair, one record gets deleted.
It doesn't matter which one. If there are two or more duplicates, one
of them remains and the rest get deleted.

Could anyone please help me write this script?

Thanks
 
D

Don Guillett Excel MVP

Hi
I have a spreadsheet with two columns. Col1 contains duplicate
records. These duplicates can be two or three or more. Col1 only
contains duplicates i.e. it doesn't have any unique records as I have
filtered them out.

Col2 contains date and time. I want to write a script which would
enable me to delete those duplicates where time difference is less
than or equal to 15 minutes. If it is a pair, one record gets deleted.
It doesn't matter which one. If there are two or more duplicates, one
of them remains and the rest get deleted.

Could anyone please help me write this script?

Thanks

Probably not too difficult with a looping macro from the bottom up.
"If desired, send your file to dguillett @gmail.com I will only look
if:
1. You send a copy of this message on an inserted sheet
2. You give me the newsgroup and the subject line
3. You send a clear explanation of what you want
4. You send before/after examples and expected results."
 
Joined
Nov 8, 2010
Messages
2
Reaction score
0
Zuzeppeddu said:
Hi
I have a spreadsheet with two columns. Col1 contains duplicate
records. These duplicates can be two or three or more. Col1 only
contains duplicates i.e. it doesn't have any unique records as I have
filtered them out.

Col2 contains date and time. I want to write a script which would
enable me to delete those duplicates where time difference is less
than or equal to 15 minutes. If it is a pair, one record gets deleted.
It doesn't matter which one. If there are two or more duplicates, one
of them remains and the rest get deleted.

Could anyone please help me write this script?

Thanks

Sort by the date and time. Add a column which calculates the difference between the time on one row and the time on the previous row. Create the code to run through and delete where the time difference value is <= 15 minutes. HOWEVER do not delete 2 consecutive rows! Otherwise if you have 14:10, 14:20, 14:30, 14:40 the differences will all be 10 minutes but you presumably don't want to delete all but the first? What do you want to do in this case? If you ensure you don't delete 2 consecutive rows, as I suggest, you will delete the 14:20 and the 14:40, which may or may not be what you want to achieve.
You will have to repeat the process to get rid of all 'duplicates' E.g. 14:10,14:15,14:20,14:25,14:30. First pass deletes 14:15, and 14:25. 2nd pass will delete 14:20, leaving 14:10 and 14:30

Thinking about it you could just calculate the difference if you don't want an extra column. If you are deleting a row or shifting cells up then the deleted row will be replaced by the next row and so you can repeat the difference calculation to see whether this 'new' next row is to be deleted. Row positioning will get messy if you are deleting the rows 'under-your feet' so to speak.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top