Is Excel a reasonable platform for this?


Karl Thompson

I'm an independent developer and my client maintains financial models in
Excel spreadsheets. They currently have 1500+ workbooks, of which 200 are
active at any given time.

The workbooks range in size from about 500 KB to 2 MB. Each workbook has a
half dozen or so named ranges, each referring to a single cell.

The client needs summary reports of the 200 active models. So, what I've
done is written an application which uses automation to open each model,
read the cells and write the data to an MS Access database. The reports are
then built in Access.
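Stripped of the Automation and Access specifics, the harvesting loop amounts to the following (a minimal Python sketch; the workbook data is made up, and SQLite stands in for the Access database):

```python
import sqlite3

# Hypothetical stand-in for the values read from each workbook's
# single-cell named ranges. In the real app these would come from
# Excel Automation: open the workbook, read each named range's value.
workbooks = {
    "oilco_model.xls":  {"Revenue": 120.0, "Capex": 35.0},
    "techco_model.xls": {"Revenue": 80.0,  "Capex": 5.0},
}

conn = sqlite3.connect(":memory:")  # SQLite standing in for the Access MDB
conn.execute("""CREATE TABLE model_values (
                    workbook   TEXT,
                    range_name TEXT,
                    value      REAL)""")

# The harvest loop: one row per named range per workbook.
for wb_name, ranges in workbooks.items():
    for range_name, value in ranges.items():
        conn.execute("INSERT INTO model_values VALUES (?, ?, ?)",
                     (wb_name, range_name, value))
conn.commit()

# A summary report: total Revenue across the active models.
total = conn.execute(
    "SELECT SUM(value) FROM model_values WHERE range_name = 'Revenue'"
).fetchone()[0]
print(total)  # 200.0
```

The loop itself is trivial; the 50-minute cost is entirely in opening each workbook over the network, which is why the bottleneck doesn't go away no matter how the loop is coded.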

This worked fine for a few years. But as you might guess, it now takes a
long time to open each model across a network. (I'm sitting here on a
Friday night, and after 50 minutes the process still isn't finished.)
Also, reading the values is not always reliable.

Questions....

First, has anyone attempted something similar? Do you think Access and
Excel are up to the task using Automation and that I just need to improve my
skills and the code?

Or is this stretching the limits of what Automation is meant for?

I would like to suggest another platform for housing the data, such as SQL
Server or even Access. The big problem (for me at least) is that I don't see
an obvious way to design a relational database that would even begin to hold
the data in the models. The power Excel offers this firm is essentially the
ability to track a large amount of data with little or no common structure
between the various models. (An oil firm requires a different model than,
say, a tech firm, and even the models for two oil firms would not be the
same.)
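For what it's worth, the usual relational answer to "no common structure" is an entity-attribute-value layout: one narrow table keyed by model and attribute name, so the attribute names live in the data rather than in the schema. A sketch (Python with SQLite; the table, attribute, and model names are all invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One narrow table holds every model's values, whatever its shape.
conn.execute("""CREATE TABLE model_data (
                    model     TEXT,   -- which workbook/model
                    attribute TEXT,   -- named-range name
                    value     REAL,
                    PRIMARY KEY (model, attribute))""")

rows = [
    ("OilCo",  "ProvenReserves", 1200.0),   # oil-specific attribute
    ("OilCo",  "Revenue",        450.0),
    ("TechCo", "MonthlyUsers",   2.5e6),    # tech-specific attribute
    ("TechCo", "Revenue",        300.0),
]
conn.executemany("INSERT INTO model_data VALUES (?, ?, ?)", rows)

# Models with completely different shapes still report side by side
# on whatever attributes they share:
revenue = dict(conn.execute(
    "SELECT model, value FROM model_data WHERE attribute = 'Revenue'"))
print(revenue)  # {'OilCo': 450.0, 'TechCo': 300.0}
```

The trade-off is that the database no longer enforces any per-model structure, and cross-attribute queries get awkward; that's exactly the tension between Excel's flexibility and a relational schema.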

Any thoughts would be appreciated. This can't be a unique problem.
 

Hank Scorpio

On Fri, 20 Aug 2004 20:03:47 -0400, "Karl Thompson"

[Pardon the top posting but there's no brief way to put the OP's
problem in context]

I don't think that you're stretching the bounds of Automation as much
as you are the bounds of processing speed and file / network access
time.

Nonetheless, it's probably not an ideal way of doing things for the
reasons that you yourself have described. I'm thinking that a server
based multidimensional solution may be a better one, though probably
not as cheap.

I use a product called TM1 from a company called Applix.
(http://www.applix.com/index.asp)

Putting it in a nutshell, the TM1 data is stored in "cubes". Each cube
is made up of dimensions, with each dimension representing a
particular characteristic of the data; say version, department, site,
GL account, or whatever. Each dimension is made up of elements, each
of which represents one value of that characteristic; for example
Actuals, Budget or Forecast for version, Sydney or Melbourne or
Brisbane etc. for site, Sales, Investment Income, Salaries or whatever
for GL account.
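In code terms (a toy Python sketch of the concept, nothing like TM1's actual API), a cube is just a mapping from one-element-per-dimension tuples to values:

```python
# Dimensions of a toy cube, each a list of elements (made-up names).
versions = ["Actuals", "Budget", "Forecast"]
sites    = ["Sydney", "Melbourne", "Brisbane"]
accounts = ["Sales", "Salaries"]

# A value lives at the intersection of one element from each dimension.
# Only intersections that actually hold data appear in the mapping,
# which is what sparse cube storage means.
cube = {
    ("Actuals", "Sydney",    "Sales"):    500.0,
    ("Budget",  "Sydney",    "Sales"):    450.0,
    ("Actuals", "Melbourne", "Salaries"): 120.0,
}

print(cube[("Actuals", "Sydney", "Sales")])  # 500.0
```

Adding a new element to a dimension is just allowing a new value in one tuple position; no schema change is needed, which is the flexibility being described here.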

You can create different cubes for different purposes. For example we
have a multi-cube model which stores data relating to our production.
It allows us to generate an internal recharge to our commercial
business units for the units produced on their behalf, as well as
storing relevant stats for the production sites. We have other cubes
which essentially mirror our GL system for reporting, other cubes for
storing relevant market statistical data, etc.

One big advantage is that you aren't bound to storing specific,
pre-defined data field combinations the way you are with a relational
database. You can simply add new elements to your dimensions and just
use the ones that apply to a given type of data. (You specify a value
in a cube at the intersection of one element from each dimension, but
for the dimensions that don't apply to a particular data set you can
just use a "not specified" element. We do this with our payroll
analysis model which draws details of both low level data from the
HRMS system, and high level data from manual journal entries in the GL
system, and combines the two into a single reporting cube.) Cube
storage is sparse, so you're only using memory / disk space for the
combinations of elements which actually contain data.

You can also use "Rules" to define relationships between the various
cubes and calculate values in other (output) cubes.
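A rule is essentially a formula over cube intersections. Continuing the toy dictionary model above (again, this is not TM1's real rule syntax, and the figures are invented):

```python
# Two input cubes, keyed by (site,) for brevity.
units_produced = {("Sydney",): 1000, ("Melbourne",): 600}
cost_per_unit  = {("Sydney",): 2.5,  ("Melbourne",): 3.0}

# A "rule" computing an output cube from the inputs: e.g. the
# internal recharge to each business unit for units produced.
recharge = {
    site: units * cost_per_unit[site]
    for site, units in units_produced.items()
}

print(recharge)  # {('Sydney',): 2500.0, ('Melbourne',): 1800.0}
```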

The data is stored on a server: an application which runs on a normal
Windows server, anywhere on the network. Any user on the network can
then access it if they're given a login and password. We have users
spread across hundreds of kms, all of whom can just connect to the
server and use the data on it.

Here's the kicker; the client side is just an add-in to Excel.
Furthermore, the add-in provides worksheet functions which link
directly to the server cubes. So wherever a user is, they can just
load up their worksheet, log on to TM1, press [F9], and get the latest
values straight from your model(s). Similarly they can use those
formulas to send data UP to the cubes... all in real time.

As I said it's not (necessarily) a cheap solution, but it may well be
worth looking into for the sort of thing that you're describing.
(Especially if you're thinking of SQL Server, which isn't all that
cheap either and may be less flexible for what you have in mind.)
 

Karl Thompson

Thank you, Hank. I think this will be a good starting point.


