Why the arbitrary spreadsheet size limitations?

T

T Magritte

I have some rather large spreadsheets that are difficult or impossible
to import into Excel.

As an example, I have one with 25 rows and around 200,000 columns.
When I attempt to import into Excel I get this error:

"- Excel cannot exceed the limit of 1,048,576 rows and 16,384 columns.
- By default, Excel places three worksheets in a workbook file. Each
worksheet can contain 1,048,576 rows and 16,384 columns of data, and
workbooks can contain more than three worksheets if your computer has
enough memory to support the additional data."

So why am I allowed to have so many more rows than columns? Why not
limit the size by number of cells rather than an arbitrary number? I
can transpose the data in the originating application to make it fit
Excel's limiations, at least in the case.

Also, it seems like there isn't actually a limit on the amount of data
a workbook can have since you can have about 16 million cells per
sheet but as many sheets as you want (I assume there's some other
limit on number of sheets). So why can't I have a single sheet with 32
million cells rather than 2 sheets with 16 million cells?

Seems if Microsoft were going to the trouble of increasing the
allowable size of spreadsheets, why not make them unrestricted in
size?

Sorry, end of rant...

(yes, I should probably be using Access or mysql or something designed
for large datasets, but sometimes Excel is very nice for quick and
dirty stuff. Access has a horrible import filter anyway, so datasets
with many columns can't be imported even if Access itself were capable
of managing he data. I haven't tried any mysql solutions yet.)
 
C

Charles Williams

Excel 2007 is limited by Windows 32-bit address space space limitations,
which in practice means that it is probably impossible to fill even one
sheet completely with Data/Formulae.

I imagine that for efficiency Excel needs to have fixed maximum dimensions
for rows and columns so that the sparse storage addressing algorithm does
not use too much memory and execution time.

Charles
__________________________________________________
The Excel Calculation Site
http://www.decisionmodels.com
 
G

Gord Dibben

Re-do your math............................

If using Excel 2007( I think you are) you have about 16.5 BILLION cells per
worksheet.

That should do you.


Gord Dibben MS Excel MVP
 
T

T Magritte

My bad, you're right. There are a lot of cells available. But only 1
million rows and 16 k columns. As I said, I'd be happy with the number
of cells available if I could use them in whatever arrangement of rows
and columns I need. But that's not the case. If the data is arranged
with many columns and few rows (as an example, my spreadsheet with 25
rows and 200 k columns) won't fit. I can import it piecemeal into
multiple sheets but it's a pain to import and use that way. Or I can
transpose it prior to import but that may not necessarily be possible
depending on the source of the data. Depending on the arrangement of
rows and columns, it might simply be impossible to import the data
into a single sheet even though the actual amount of data is far below
the limit of the sheet and Windows memory limitations...

Thanks.
 
T

T Magritte

In this case the data comes from a mouse microarray which is a
technology that can measure (more or less) the expression of genes
(basically, all the genes) of a mouse. Of course there are similar
arrays for humans and other species. This one is an exon array, so it
detects expression of individual exons within each gene. (Genes can
express multiple exons that can be arranged in different ways so that
a number of different proteins can be produced from the same gene.)
This, of course, adds a lot more data to the mix...

So in this particular example I'm looking at 25 mice (25 rows) and a
core set of genes that have combined a little under 20 k exons
(columns). (Actually, if you look at the complete array there's more
like a million exons but then the amount of data starts to get really
ridiculous for Excel.)

If you merge the exon data into gene level data there's a little over
16 k genes. So in that case you can almost import the whole data set
into a sheet but it turns out there's a little bit over the column
limit still.

As I said, if I transpose the data before importing into excel so that
there are only 25 columns, it's fine. But it would be better if that
didn't have to be done.

Really, I'm still learning how to do this stuff. There are software
packages more appropriate for dealing with this type of data and
analysis but since I'm much more familiar with Excel I have this
tendency to fall back to it for a lot of quick and dirty stuff.
Despite it's limitations it's a very useful tool and I just wish it
could be more versatile for large sets. But I guess there isn't enough
of a market for an Office Science Edition...

Thanks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top