Word <-> Excel automation & Too Many Named Ranges?

G

Guest

Hi! 2 weeks effort down the drain unless you have an answer...

[Office XP SP3 running under XP Pro, SP2, 1GB mem)

I have written VBA in MS Word to generate a concordance for a book <=100k
words. Because of bookmark limitations in word (max ~16387) I have written
automation code to pipe data to an Excel spreadsheet, creating named ranges
on the fly - the name containing the range start and end for the target in
the word document. When opened, the workbook recovers the source doc name
from its own and creates a new automation session to goto words/phrases in
the word document based on the range names (and hence range start/end values)

Everything is fine for small documents, but having just processed the book
(24 hours run-time!) whilst no errors were encountered* (* see below), when
attempting to open the resulting 23MB Excel workbook, Excel fails to open the
file, going through a process of repair before deciding it's all hopeless
(for an intermediate snapshot of ~6MB the log file says eventually "Damage to
the file was so extensive that repairs were not possible. Excel attempted to
recover your formulas and values, but some data may have been lost or
corrupted." - yes, there's data but the workbook is functionally completely
messed up; for the full file I don't even get that!).

Question: specifications say that named ranges are limited only by memory,
but when running the automation xl didn't take more than ~50MB-ish (don't
remember exactly) but much less than the available RAM without even
considering virtual memory. However, given that I have approximately 200k
named ranges, is that the source of the problem? One sheet has the full
65536 rows and a full 256 columns for *some* rows (the others have ~8,000 and
45,000 rows respectively).

Any ideas? Am at my wits end... I tried doing the concordance in Word alone
in the first place, then going via XML output from Word into Excel before
direct automation links... I see no other way ahead!

Best regards

Jules

* there was an error early on when attempting to set cell value and name
range, but invoking xlDoc.save from word and resuming allowed execution to
proceed - MS please note!
 
J

Jef Gorbach

Concordance : an alphabetical index of the principal words in a book or the
works of an author with their immediate contexts -
http://www.m-w.com/cgi-bin/dictionary

Im likely misunderstanding something, but seems like your generating an
alphabetical list of unique words within the book, so why not pass each word
to Excel via the same variable thereby saving a TON of processing memory
(and likely make more readable code).
Pass each word to the same column (presuming < 65,536 words) then
sort/reduce the result down to your desired unique values.

"Resignation of an Excel Guru :(" <Resignation of an Excel Guru
:(@discussions.microsoft.com> wrote in message
Hi! 2 weeks effort down the drain unless you have an answer...

[Office XP SP3 running under XP Pro, SP2, 1GB mem)

I have written VBA in MS Word to generate a concordance for a book <=100k
words. Because of bookmark limitations in word (max ~16387) I have written
automation code to pipe data to an Excel spreadsheet, creating named ranges
on the fly - the name containing the range start and end for the target in
the word document. When opened, the workbook recovers the source doc name
from its own and creates a new automation session to goto words/phrases in
the word document based on the range names (and hence range start/end values)

Everything is fine for small documents, but having just processed the book
(24 hours run-time!) whilst no errors were encountered* (* see below), when
attempting to open the resulting 23MB Excel workbook, Excel fails to open the
file, going through a process of repair before deciding it's all hopeless
(for an intermediate snapshot of ~6MB the log file says eventually "Damage to
the file was so extensive that repairs were not possible. Excel attempted to
recover your formulas and values, but some data may have been lost or
corrupted." - yes, there's data but the workbook is functionally completely
messed up; for the full file I don't even get that!).

Question: specifications say that named ranges are limited only by memory,
but when running the automation xl didn't take more than ~50MB-ish (don't
remember exactly) but much less than the available RAM without even
considering virtual memory. However, given that I have approximately 200k
named ranges, is that the source of the problem? One sheet has the full
65536 rows and a full 256 columns for *some* rows (the others have ~8,000 and
45,000 rows respectively).

Any ideas? Am at my wits end... I tried doing the concordance in Word alone
in the first place, then going via XML output from Word into Excel before
direct automation links... I see no other way ahead!

Best regards

Jules

* there was an error early on when attempting to set cell value and name
range, but invoking xlDoc.save from word and resuming allowed execution to
proceed - MS please note!
 
G

Guest

I understand the suggestion but it's not just a list of the words - each
instance of the word (and two-word and three-word phrases on other sheets) is
recorded so that I can review each occurrence in its contex, so each cell
must contain the word (so I can see what which instance I am looking up!) and
be associated with the range information....

I think I might do as you suggest though and dump each word and the range
values into one cell and use character formatting to hide the range data... I
can easily adapt the range-name parser to extract it from cell contents
instead of range name...

But I'd still like to know what the problem is! Having driven word into the
500MB+ memory domain, what is Excel's issue with large numbers of named
ranges (if that is indeed the problem)?

And the code is quite compact and readable actually!

Thanks

Jules

Jef Gorbach said:
Concordance : an alphabetical index of the principal words in a book or the
works of an author with their immediate contexts -
http://www.m-w.com/cgi-bin/dictionary

Im likely misunderstanding something, but seems like your generating an
alphabetical list of unique words within the book, so why not pass each word
to Excel via the same variable thereby saving a TON of processing memory
(and likely make more readable code).
Pass each word to the same column (presuming < 65,536 words) then
sort/reduce the result down to your desired unique values.

"Resignation of an Excel Guru :(" <Resignation of an Excel Guru
:(@discussions.microsoft.com> wrote in message
Hi! 2 weeks effort down the drain unless you have an answer...

[Office XP SP3 running under XP Pro, SP2, 1GB mem)

I have written VBA in MS Word to generate a concordance for a book <=100k
words. Because of bookmark limitations in word (max ~16387) I have written
automation code to pipe data to an Excel spreadsheet, creating named ranges
on the fly - the name containing the range start and end for the target in
the word document. When opened, the workbook recovers the source doc name
from its own and creates a new automation session to goto words/phrases in
the word document based on the range names (and hence range start/end values)

Everything is fine for small documents, but having just processed the book
(24 hours run-time!) whilst no errors were encountered* (* see below), when
attempting to open the resulting 23MB Excel workbook, Excel fails to open the
file, going through a process of repair before deciding it's all hopeless
(for an intermediate snapshot of ~6MB the log file says eventually "Damage to
the file was so extensive that repairs were not possible. Excel attempted to
recover your formulas and values, but some data may have been lost or
corrupted." - yes, there's data but the workbook is functionally completely
messed up; for the full file I don't even get that!).

Question: specifications say that named ranges are limited only by memory,
but when running the automation xl didn't take more than ~50MB-ish (don't
remember exactly) but much less than the available RAM without even
considering virtual memory. However, given that I have approximately 200k
named ranges, is that the source of the problem? One sheet has the full
65536 rows and a full 256 columns for *some* rows (the others have ~8,000 and
45,000 rows respectively).

Any ideas? Am at my wits end... I tried doing the concordance in Word alone
in the first place, then going via XML output from Word into Excel before
direct automation links... I see no other way ahead!

Best regards

Jules

* there was an error early on when attempting to set cell value and name
range, but invoking xlDoc.save from word and resuming allowed execution to
proceed - MS please note!
 
M

Mark Lincoln

I wonder if the sheer number of names actually is the problem. While
you might have more than enough memory to create them, Excel has to
deal with them. It may be spending all those CPU cycles and time
trying to maintain an index.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top