PC Review


Reply
Thread Tools Rate Thread

How to deal with a growing dataset...

 
 
=?Utf-8?B?Sm9lICBQb3Jra2EgW01TRlRd?=
Guest
Posts: n/a
 
      17th Aug 2007
I have a sheet that looks like
User# 1/10/2007 1/17/2007 1/21/2007
1 0 21 44
2 0 45 66
3 0 22 76
4 0 13 45
5 0 45 90

So basically, every week the number of columns will increase as I collect
more data -- my table continues getting bigger and bigger.
Someplace then I will have an column of equations for the current data
for each user that looks like
= (D2-c2) * something

Where "D2" and "C2" should be the two most recent columns in the above table


This seems like such an obvious thing, yet I've rarely seen a sample
spreadsheet that deals with a constantly growing table. I have struggled with
this problem in various forms for a long time and have never found a good
solution.
The alternative here would be to manually shift the data over each period
when
entering the new data - or manually edit those equations.

 
Reply With Quote
 
 
 
 
=?Utf-8?B?RGF2ZQ==?=
Guest
Posts: n/a
 
      17th Aug 2007
Quick fix to this problem that i could think of would be using the =MAX
function coupled with the =LOOKUP function (assuming the dates are in
assending order). cant quite understand what you are asking. Max obviosly
finds the most recent date and the lookup can reference what cells you need

"Joe Porkka [MSFT]" wrote:

> I have a sheet that looks like
> User# 1/10/2007 1/17/2007 1/21/2007
> 1 0 21 44
> 2 0 45 66
> 3 0 22 76
> 4 0 13 45
> 5 0 45 90
>
> So basically, every week the number of columns will increase as I collect
> more data -- my table continues getting bigger and bigger.
> Someplace then I will have an column of equations for the current data
> for each user that looks like
> = (D2-c2) * something
>
> Where "D2" and "C2" should be the two most recent columns in the above table
>
>
> This seems like such an obvious thing, yet I've rarely seen a sample
> spreadsheet that deals with a constantly growing table. I have struggled with
> this problem in various forms for a long time and have never found a good
> solution.
> The alternative here would be to manually shift the data over each period
> when
> entering the new data - or manually edit those equations.
>

 
Reply With Quote
 
=?Utf-8?B?RWxrYXI=?=
Guest
Posts: n/a
 
      17th Aug 2007
You could use the LARGE function to find the two largest dates in row 1.

LARGE(1:1,1) finds first largest date
LARGE(1:1,2) finds second largest date

Then, you do something like:

=(HLOOKUP(LARGE(1:1,1),A1:Z10,2)-HLOOKUP(LARGE(1:1,2),A1:Z10,2))*something

Adjust the ranges to meet your needs.

HTH,
Elkar


"Joe Porkka [MSFT]" wrote:

> I have a sheet that looks like
> User# 1/10/2007 1/17/2007 1/21/2007
> 1 0 21 44
> 2 0 45 66
> 3 0 22 76
> 4 0 13 45
> 5 0 45 90
>
> So basically, every week the number of columns will increase as I collect
> more data -- my table continues getting bigger and bigger.
> Someplace then I will have an column of equations for the current data
> for each user that looks like
> = (D2-c2) * something
>
> Where "D2" and "C2" should be the two most recent columns in the above table
>
>
> This seems like such an obvious thing, yet I've rarely seen a sample
> spreadsheet that deals with a constantly growing table. I have struggled with
> this problem in various forms for a long time and have never found a good
> solution.
> The alternative here would be to manually shift the data over each period
> when
> entering the new data - or manually edit those equations.
>

 
Reply With Quote
 
=?Utf-8?B?RHVrZSBDYXJleQ==?=
Guest
Posts: n/a
 
      17th Aug 2007
Joe -

You are running into 2 separate issues:

The first is trying to use a spreadsheet to track data that screams out to
be in a DATABASE.

The second is that your data is structured awkwardly.

Based on your sample data, you are far better off to have 3 and only 3
columns:

UserID, Date, and Value

Once Your data is in that format, you can use a simple SUMPRODUCT() formula
to calculate the value you are seeking.

Another point about the suggested layout is that it allows you to do pivot
tables and other analyses easy as anything, while your current structure
makes analysis very difficult.


"Joe Porkka [MSFT]" wrote:

> I have a sheet that looks like
> User# 1/10/2007 1/17/2007 1/21/2007
> 1 0 21 44
> 2 0 45 66
> 3 0 22 76
> 4 0 13 45
> 5 0 45 90
>
> So basically, every week the number of columns will increase as I collect
> more data -- my table continues getting bigger and bigger.
> Someplace then I will have an column of equations for the current data
> for each user that looks like
> = (D2-c2) * something
>
> Where "D2" and "C2" should be the two most recent columns in the above table
>
>
> This seems like such an obvious thing, yet I've rarely seen a sample
> spreadsheet that deals with a constantly growing table. I have struggled with
> this problem in various forms for a long time and have never found a good
> solution.
> The alternative here would be to manually shift the data over each period
> when
> entering the new data - or manually edit those equations.
>

 
Reply With Quote
 
=?Utf-8?B?Sm9lICBQb3Jra2EgW01TRlRd?=
Guest
Posts: n/a
 
      17th Aug 2007
Hey that sounds like the right approach...
You are right - eventually I do want a database, but in the meantime I want
to get something working a little sooner.

I don't understand how SUMPRODUCT does the computation necessary.
Given the table of 3 columns (user, date, currentData), what I would want is
(in real life the data would be sorted by date, not user, but that shouldn't
matter here at all).
A B C
1 User Date Reading
2 1 1/10/2007 0
3 1 1/17/2007 21
4 1 1/21/2007 44
5 2 1/10/2007 0
6 2 1/17/2007 45
7 2 1/21/2007 66
8 3 1/10/2007 0
9 3 1/17/2007 22
10 3 1/21/2007 76
11 4 1/10/2007 0
12 4 1/17/2007 13
13 4 1/21/2007 45
14 5 1/10/2007 0
15 5 1/17/2007 45
16 5 1/21/2007 90


For each unique user
select 2 most recent rows and subtract currentData.
So I would end up with data that looks like:
user Delta Comment
1 23 "C4-c3"
2 21 "c7-c6"
3 54 "c10-c9"
4 32
5 45



"Duke Carey" wrote:

> Joe -
>
> You are running into 2 separate issues:
>
> The first is trying to use a spreadsheet to track data that screams out to
> be in a DATABASE.
>
> The second is that your data is structured awkwardly.
>
> Based on your sample data, you are far better off to have 3 and only 3
> columns:
>
> UserID, Date, and Value
>
> Once Your data is in that format, you can use a simple SUMPRODUCT() formula
> to calculate the value you are seeking.
>
> Another point about the suggested layout is that it allows you to do pivot
> tables and other analyses easy as anything, while your current structure
> makes analysis very difficult.
>
>
> "Joe Porkka [MSFT]" wrote:
>
> > I have a sheet that looks like
> > User# 1/10/2007 1/17/2007 1/21/2007
> > 1 0 21 44
> > 2 0 45 66
> > 3 0 22 76
> > 4 0 13 45
> > 5 0 45 90
> >
> > So basically, every week the number of columns will increase as I collect
> > more data -- my table continues getting bigger and bigger.
> > Someplace then I will have an column of equations for the current data
> > for each user that looks like
> > = (D2-c2) * something
> >
> > Where "D2" and "C2" should be the two most recent columns in the above table
> >
> >
> > This seems like such an obvious thing, yet I've rarely seen a sample
> > spreadsheet that deals with a constantly growing table. I have struggled with
> > this problem in various forms for a long time and have never found a good
> > solution.
> > The alternative here would be to manually shift the data over each period
> > when
> > entering the new data - or manually edit those equations.
> >

 
Reply With Quote
 
=?Utf-8?B?RHVrZSBDYXJleQ==?=
Guest
Posts: n/a
 
      17th Aug 2007
With the user ID in E1, the latest date in F1, and the next latest date in G1

=SUMPRODUCT(--(A2:A16=E1),--(B2:B16=F1),C2:C16)-SUMPRODUCT(--(A2:A16=E1),--(B2:B16=G1),C2:C16)

"Joe Porkka [MSFT]" wrote:

> Hey that sounds like the right approach...
> You are right - eventually I do want a database, but in the meantime I want
> to get something working a little sooner.
>
> I don't understand how SUMPRODUCT does the computation necessary.
> Given the table of 3 columns (user, date, currentData), what I would want is
> (in real life the data would be sorted by date, not user, but that shouldn't
> matter here at all).
> A B C
> 1 User Date Reading
> 2 1 1/10/2007 0
> 3 1 1/17/2007 21
> 4 1 1/21/2007 44
> 5 2 1/10/2007 0
> 6 2 1/17/2007 45
> 7 2 1/21/2007 66
> 8 3 1/10/2007 0
> 9 3 1/17/2007 22
> 10 3 1/21/2007 76
> 11 4 1/10/2007 0
> 12 4 1/17/2007 13
> 13 4 1/21/2007 45
> 14 5 1/10/2007 0
> 15 5 1/17/2007 45
> 16 5 1/21/2007 90
>
>
> For each unique user
> select 2 most recent rows and subtract currentData.
> So I would end up with data that looks like:
> user Delta Comment
> 1 23 "C4-c3"
> 2 21 "c7-c6"
> 3 54 "c10-c9"
> 4 32
> 5 45
>
>
>
> "Duke Carey" wrote:
>
> > Joe -
> >
> > You are running into 2 separate issues:
> >
> > The first is trying to use a spreadsheet to track data that screams out to
> > be in a DATABASE.
> >
> > The second is that your data is structured awkwardly.
> >
> > Based on your sample data, you are far better off to have 3 and only 3
> > columns:
> >
> > UserID, Date, and Value
> >
> > Once Your data is in that format, you can use a simple SUMPRODUCT() formula
> > to calculate the value you are seeking.
> >
> > Another point about the suggested layout is that it allows you to do pivot
> > tables and other analyses easy as anything, while your current structure
> > makes analysis very difficult.
> >
> >
> > "Joe Porkka [MSFT]" wrote:
> >
> > > I have a sheet that looks like
> > > User# 1/10/2007 1/17/2007 1/21/2007
> > > 1 0 21 44
> > > 2 0 45 66
> > > 3 0 22 76
> > > 4 0 13 45
> > > 5 0 45 90
> > >
> > > So basically, every week the number of columns will increase as I collect
> > > more data -- my table continues getting bigger and bigger.
> > > Someplace then I will have an column of equations for the current data
> > > for each user that looks like
> > > = (D2-c2) * something
> > >
> > > Where "D2" and "C2" should be the two most recent columns in the above table
> > >
> > >
> > > This seems like such an obvious thing, yet I've rarely seen a sample
> > > spreadsheet that deals with a constantly growing table. I have struggled with
> > > this problem in various forms for a long time and have never found a good
> > > solution.
> > > The alternative here would be to manually shift the data over each period
> > > when
> > > entering the new data - or manually edit those equations.
> > >

 
Reply With Quote
 
=?Utf-8?B?Sm9lICBQb3Jra2EgW01TRlRd?=
Guest
Posts: n/a
 
      26th Aug 2007
While this works, I'm not thrilled with the way it ends up looking in the
workbook.

I did find "structured references" - which seems to be a better match for
what I'm looking for. I've only just started learning about them - but it
seems like a great new feature in Excel 2007.

"Duke Carey" wrote:

> With the user ID in E1, the latest date in F1, and the next latest date in G1
>
> =SUMPRODUCT(--(A2:A16=E1),--(B2:B16=F1),C2:C16)-SUMPRODUCT(--(A2:A16=E1),--(B2:B16=G1),C2:C16)
>
> "Joe Porkka [MSFT]" wrote:
>
> > Hey that sounds like the right approach...
> > You are right - eventually I do want a database, but in the meantime I want
> > to get something working a little sooner.
> >
> > I don't understand how SUMPRODUCT does the computation necessary.
> > Given the table of 3 columns (user, date, currentData), what I would want is
> > (in real life the data would be sorted by date, not user, but that shouldn't
> > matter here at all).
> > A B C
> > 1 User Date Reading
> > 2 1 1/10/2007 0
> > 3 1 1/17/2007 21
> > 4 1 1/21/2007 44
> > 5 2 1/10/2007 0
> > 6 2 1/17/2007 45
> > 7 2 1/21/2007 66
> > 8 3 1/10/2007 0
> > 9 3 1/17/2007 22
> > 10 3 1/21/2007 76
> > 11 4 1/10/2007 0
> > 12 4 1/17/2007 13
> > 13 4 1/21/2007 45
> > 14 5 1/10/2007 0
> > 15 5 1/17/2007 45
> > 16 5 1/21/2007 90
> >
> >
> > For each unique user
> > select 2 most recent rows and subtract currentData.
> > So I would end up with data that looks like:
> > user Delta Comment
> > 1 23 "C4-c3"
> > 2 21 "c7-c6"
> > 3 54 "c10-c9"
> > 4 32
> > 5 45
> >
> >
> >
> > "Duke Carey" wrote:
> >
> > > Joe -
> > >
> > > You are running into 2 separate issues:
> > >
> > > The first is trying to use a spreadsheet to track data that screams out to
> > > be in a DATABASE.
> > >
> > > The second is that your data is structured awkwardly.
> > >
> > > Based on your sample data, you are far better off to have 3 and only 3
> > > columns:
> > >
> > > UserID, Date, and Value
> > >
> > > Once Your data is in that format, you can use a simple SUMPRODUCT() formula
> > > to calculate the value you are seeking.
> > >
> > > Another point about the suggested layout is that it allows you to do pivot
> > > tables and other analyses easy as anything, while your current structure
> > > makes analysis very difficult.
> > >
> > >
> > > "Joe Porkka [MSFT]" wrote:
> > >
> > > > I have a sheet that looks like
> > > > User# 1/10/2007 1/17/2007 1/21/2007
> > > > 1 0 21 44
> > > > 2 0 45 66
> > > > 3 0 22 76
> > > > 4 0 13 45
> > > > 5 0 45 90
> > > >
> > > > So basically, every week the number of columns will increase as I collect
> > > > more data -- my table continues getting bigger and bigger.
> > > > Someplace then I will have an column of equations for the current data
> > > > for each user that looks like
> > > > = (D2-c2) * something
> > > >
> > > > Where "D2" and "C2" should be the two most recent columns in the above table
> > > >
> > > >
> > > > This seems like such an obvious thing, yet I've rarely seen a sample
> > > > spreadsheet that deals with a constantly growing table. I have struggled with
> > > > this problem in various forms for a long time and have never found a good
> > > > solution.
> > > > The alternative here would be to manually shift the data over each period
> > > > when
> > > > entering the new data - or manually edit those equations.
> > > >

 
Reply With Quote
 
 
 
Reply

Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
File size growing growing exponentially Steve Microsoft Excel Misc 6 24th May 2009 06:29 PM
File size growing growing exponentially Steve Microsoft Excel Misc 0 24th May 2009 08:31 AM
how to deal with missing column in a dataset. Heath P. Dillon Microsoft VB .NET 3 18th Feb 2009 11:26 AM
BUG? IE6 Windows XP SP2: iexplore.exe ram is growing and growing... Michael Schwarz Windows XP Internet Explorer 0 14th Oct 2004 04:05 PM
Best method to deal with dataset??? Vivek Microsoft ADO .NET 0 20th Jul 2004 01:56 PM


Features
 

Advertising
 

Newsgroups
 


All times are GMT +1. The time now is 06:33 PM.