Calculating Duplicate Entries

B

Bad_Shot

I use Excel 2003.
I pulled good info off this forum and off other sites referenced within.

I can do what I’m trying to do with filters, formulas and macros, etc.
But to I can do it faster with a calculator.

Here’s my data:

A B C
1 Sale Price Address Sale Date
2 $25000 216 Oak Ave 12/7/2007
3 $35500 14 Wyman St 1/12/2008
4 $29000 216 Oak Ave 7/1/2008
5 $49000 14 Wyman St 9/30/2008
6 $45000 216 Oak Ave 10/2/2008

I need to find all sales that were less then 6 months apart, and the sales
price was 25% greater then the previous price. In this case row 6 would be
true.
Note the rows can be from 250 to 1500 per month.

Thanks.
 
M

Max

Your criteria is quite complex. Here's one formulas way to model it up, using
a couple of cols for better clarity in the processes involved ..

Source data as posted in A2:C6,
with real dates presumed chronologic in C2 down

In D2: =COUNTIF($B$2:B2,B2)

In E2, normal ENTER:
=IF($D2>1,INDEX(C$2:C$6,MATCH(1,INDEX((B$2:B$6=$B2)*(D$2:D$6=$D2-1),),0)),"")

In F2, normal ENTER:
=IF($D2>1,INDEX(A$2:A$6,MATCH(1,INDEX((B$2:B$6=$B2)*(D$2:D$6=$D2-1),),0)),"")

Then in G2:
=IF(E2="","",IF(AND(C2-E2<180,(A2-F2)/F2>25%),"x",""))
Copy D2:G2 down to G6. Col G will flag those cases which satisfy the complex
criteria with "x".

Col D marks the instances of the addresses in col B. Cols E/F then extracts
the cost/date of the (n-1)th instance from cols A and C. Col G finally
computes the required comparisons/criteria of dates < 6 mths* apart & cost
delta >25%, and flags such cases.
*taking 1 mth = 30 days

Adapt to suit the extents of your actuals.
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:22,500 Files:370 Subscribers:66
xdemechanik
 
B

Bernard Liengme

If we are permitted to sort by address (since I assume you want less than 6
months, 25% increase for SAME address), then this formula
=IF(AND(MONTH(C3)-MONTH(C2)<6,(A3-A2)/A2>0.25),"X","")
placed in D3 and copied down the column results in an X next to the Oct
sales of Oak Ave.

Note in row 1 I have labels; in row 2 I have the first sale and I have
sorted the data by Address.
best wishes
 
B

Bad_Shot

Thanks.
Bernard Yes correct on both assumptions. I can sort by address or any other
col.
The data starts off with A-U. I then ,via a macro move, it to another
worksheet, deleted stuff I don't need, concatenate two cols to build the
address.
Here's were I would use either your suggestion or Max's and determine
whether the houses have been flipped (sold for more then 25% in 6 months).
 
S

Shane Devenshire

Hi,

Here is a simple way - I will assume the addresses are in column B and the
dates in C, starting on row 2 with titles on row 1.

1. Sort by Address as 1st key, and Date as 2nd key.
2. Highlight all the dates and choose Format, Conditional Formatting, from
the first drop down pick Formula is, in the second box enter the formula
=AND(DATEDIF(C2,C3,"M")<6,B2=B3)
Click Format and choose a color on the Patterns tab. Click OK twice.

This will format the date of all the sales that are less than 6 months apart
for a prarticular address. You could highlight all the data and repeat the
above process but change the formula to read

=AND(DATEDIF($C2,$C3,"M")<6,$B2=$B3)

This would format the entire row.
 
B

Bad_Shot

Shane;
This works, but I've been here before. It still requires me to sreach thru
the 1500 rows to fine the highligthed ones and do the calculations.
 
B

Bad_Shot

Bernard;
We must have something wrong here. This is producing a "X" for duplicate
dates, regardless of the address. I need to play with it some more.
Thanks
 
B

Bad_Shot

Max;
Once I sorted the data, 1st by address, then by Date I got some results.
Unique addresses produced a "1", the second occurance of that address
produced a "2".
The next formulars produced "#NA" results in their respective fields.
I need to break down the nested Index/Match/Index to find out what's going on.

Thanks
 
M

Max

My earlier solution doesn't require any sorting by Address,
it should work "as-is", with the only presumption as stated:
.. real dates presumed chronologic in C2 down

Perhaps rummaging through this working sample might help:
http://freefilehosting.net/download/44fl3
Duplicates flagging w criteria.xls

The #N/As that you're hitting could be due to some "dirty" source data
somewhere that's tripping up the processing, eg: formula returned error
values, extra spaces, invalid dates, etc. Try sleuthing around for these, and
clean it up.

If you still can't get it to work, post a link to a desensitised sample file
(you can use the same filehost above to upload)
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:22,500 Files:370 Subscribers:66
xdemechanik
---
 
B

Bad_Shot

Max;
Got your link, See that it works. I added some more data to your sheet and
it still works.
I've got to get back to my sheet. I must have a ref wrong when typing it in.
I actually have 25 cols of data, vs. the simplified stuff I posted.
I'll keep you posted.
Thanks
 
B

Bad_Shot

Max;
Got it working. Thanks.


Max said:
My earlier solution doesn't require any sorting by Address,
it should work "as-is", with the only presumption as stated:

Perhaps rummaging through this working sample might help:
http://freefilehosting.net/download/44fl3
Duplicates flagging w criteria.xls

The #N/As that you're hitting could be due to some "dirty" source data
somewhere that's tripping up the processing, eg: formula returned error
values, extra spaces, invalid dates, etc. Try sleuthing around for these, and
clean it up.

If you still can't get it to work, post a link to a desensitised sample file
(you can use the same filehost above to upload)
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:22,500 Files:370 Subscribers:66
xdemechanik
 
B

Bad_Shot

Thanks Again;
Up for another question or should I make another post?
As I stated before my rows can be from 250 to 1500.
In the example below. I have to keep changing the last row to ensure I get
all the data. I have about 50 formulars in the spread sheet that require
changing.
Any easy way to do this?
Thanks again!

=MEDIAN(IF(B30:B499<=(TODAY()-90),IF(B30:B499>=(TODAY()-180),E30:E499)))
 
M

Max

Thanks for all the ratings.
Up for another question or should I make another post?
Pl put in new queries as new posts. Maximizes your query's exposure to ALL
responders, and it makes for better archiving/retrievals in future, too.
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:22,500 Files:370 Subscribers:66
xdemechanik
---
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Top