Excel file size and paste special

G

Guest

Hi,

I initially posted this on a Reporting Services thread, but was adviced to
place it elsewhere. So here it is....

I am experiencing some strange Excel behavior with regards to file size
expansion, and in need of support from MS to see if they have a solution for
it. In essence, an Excel workbook's size gets bloated when a copy / paste
special is performed on cells which contain formulae.

Try the following:
1. Open a Workbook (and add some formulae to the first row)

2. Copy the first row (with the formulas)
3. Select all the rows inserted in step 2, right-click and select "Paste
Special". The options for Paste Special should be
Paste All, Operation None, No Transpose, No Skip Blanks
4. Once pasted, save the workbook (call it "Original Workbook") and close
it.
5. Make a copy of the workbook saved in step 5. Call it "Copy Workbook".
6. Open the copied workbook in step 6, save it and then close it.



You will notice a difference in file size between the "Original Workbook"
and the "Copy Workbook". Depending on the number of rows inserted and pasted
over, this difference could be very significant. In one trial we pasted
about 45 rows, and noticed the following:

Size of Original Workbook = 264KB
Size of Copy Workbook = 226KB
Difference = 38 KB (or about 17 %)

Comparing the file contents using a binary file comparer (like the one in
Source Control) revealed that there were certain chunks of data in the
Original Workbook that was not present in the Copy Workbook. Furthermore,
this data was not in a contiguous section of the file but rather interlaced
between other data.

Also, if the row that was copied in step 3 (that is the template row), did
not contain any formulas there was no difference in file size between the
Original and the Copy.
Additional Information: The behavior occurs only when copy and pasting
excel cells which contain formulae using named ranges.
That is, the Formula is of the form QTY_RANGE * PRICE_RANGE, instead of
using cell references like A1*B1



Thought:Can this have something to do with formulas, perhaps Excel tries to
recalculate all the cells and in the process leaves some scrap information
that gets saved with the file on the first save. Subsequent saves removes
this "scrap" data.


Any advice is appreciated.

Thanks,

Tim
 
G

Guest

I'll try your test later - (it's late here now). But in passing, I'd say the
KEY POINT you made was your "additional" information: that the behavior only
occurs when the formulas contain references to named ranges.

Over the years I've seen Excel do some strange things with named ranges,
including several instances (back when Excel 97 was King) where a file would
simply lose all of the definitions of the names in the workbook! I cannot
say that I recall such an incident in versions after that, but those days
still make me a little gunshy about them, but they are too valuable to not
use at times.
 
G

Guest

Maybe my workbook wasn't large or complex enough or I wasn't trying hard
enough, but I didn't duplicate the problem. You mentioned inserting rows at
step 2, but you didn't say where or how many to insert, so I kind of winged
it.

Excel 2003 (SP2) on a Win XP Pro/SP2 machine: started new workbook. On
Sheet2 I gave 2 cells names (FirstValue and SecondValue). Back on Sheet1 I
put some simple formuls into A1:E1 referencing those cells by their names. I
then inserted 3 rows and performed the Copy | Paste Special operation. Saved
the workbook. Closed it. Went to the saved location and used copy to create
a copy of it on disk. So at that time I had Originalworkbook.xls and Copy of
Originalworkbook.xls. I opened the copy, saved it, closed it and then
checked actual file sizes on disk. Both were exactly the same at 13,824
bytes and both occupied same disk space (16,384 bytes).
 
G

Guest

Apologize for not being more specific. Can you attemt the same, but insert
about 300 rows (given that your formula is rather simple).

I had noted earlier that I inserted about 45 rows, but then again my
formulas were rather complex (using VARRAYS and such).

Thanks,

Tim
 
G

Guest

I'll give it another shot later today. I'll try to perhaps make things a
little more complex as far as the formulas go - something beyond simple + - *
and / operations.
 
G

Guest

OK, I duplicated it. Added a small 3 column lookup table on Sheet2 also,
added a few VLOOKUP() formulas to the one row in Original workbook. Inserted
300 rows and did a copy paste-special into them. Saved the file: 274,528
bytes on disk. Made a copy of it.
Opened Excel
Opened the Copy
Saved it - now at 162,816 bytes.

Now, another interesting thing: open your original (274,528 byte sized) and
then use File Save As and just save it over itself on the drive (answering
YES to the "file exists, overwrite?" warning). File size is now 162,816
bytes (same as the copy).

Shakes head, walks off into the fog muttering "Of all the gin joints in all
the towns in all the world..."
 
G

Guest

Thanks for confirming.

As I mentioned before...
Comparing the file contents using a binary file comparer (like the one in
Source Control) revealed that there were certain chunks of data in the
Original Workbook that was not present in the Copy Workbook. Furthermore,
this data was not in a contiguous section of the file but rather interlaced
between other data.

Can someone in the Excel development team shed some light to this?

Thanks,

Tim
 
G

Guest

I suppose I could ask around. You didn't say which version of Excel you are
using - since you have mentioned the development team, I would presume 2007?
I actually duplicated the problem using 2003. So if you're using 2007, it
would seem this is a condition that may have existed through multiple
generations of Excel.
 
G

Guest

Sorry for not being specific. I am using Excel 2003. I mentioned
"Development", but a more proper term would have been Technical Support (as
development for 2003 has long ended). Essentially, anyone who knows
inner-workings of Excel.

Thanks for working through with me on this.

-Tim
 
G

Guest

I don't think we're going to get far with this - as you noted, that's a 2003
issue and development has stopped and all we're probably going to see are
security fixes to it. You might post the question directly to Microsoft in
the MSFT Excel discussion forum (you have choice to do that on a new message).

I tried the same thing in Excel 2007 using the .xlsx file format and could
not duplicate the issue - both the original file and the saved version (with
300 rows added) were 27KB in size.
 
G

Guest

You're probably right. At the least I would like to see MS publish a KB
article on it (as a known issue), so people know its a problem and don't
think its something to do with their customization assemblies.

Thanks for all your help,

-Tim
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top