REQ: Text index generator For Win98

Mike Bourke

Hi all,

I've been trying to track down a small app (not a full office suite!) that
will take an .rtf file and the page parameters (width & height) and generate
in plain text an index to the document. I don't care if it makes an entry
for "and" "but" and "the" - I can edit those out - I just want a listing of
every word in the document and what printed pages they will appear on.

The app must accept RTF files (to allow for different fonts and font sizes)
and run on Win98 or earlier - I can cope with Win95, Win3.1, or even a DOS
app!

I would prefer it if the page dimensions could be specified in mm or cm, but
can cope if the app only understands inches.

I tried a web search for such a product and found dozens of options for
generating indexes of the contents of a folder, or for generating index
pages for web sites, or for reading index.dat files, or converting plain
.txt to .rtf, or even for indexing every document on a hard drive for fast
searching - in fact, just about everything you can think of relating to the
word "index" except what I was looking for.

The office suites and more advanced programs generally force you to go
through the document manually selecting which words you want entered into
the index - OpenOffice does it that way, for example - but I think it would
be easier and lots faster to just delete one or more lines of index
containing words you don't care about. If I have to, I'll use OO but I'm
really looking for a more brute-force solution - probably an older, simpler
app.

Thanks in advance,

Mike Bourke.
 
William F. Adams

Such an automated tool as you describe creates a concordance, not an
index --- useful for some tasks, but not so much as an index. Virginia
Systems' Sonar Bookends does this, but it's not freeware. An index
should be a creative work in its own right, embodying appropriate
synonyms and a conceptual structure for the topic at hand.

FWIW, it's trivial to search and replace in a .rtf to tag terms for
Word's (or OO's) indexing feature though --- I did this for the Naval
Institute Press's Combat Fleets of the World CD-ROM before pulling the
text into Macromedia RoboHelp to make the .html files.

William
 
Mike Bourke

William F. Adams said:
Such an automated tool as you describe creates a concordance, not an
index --- useful for some tasks, but not so much as an index.

Concordance (n.) = index of the principal words in a document. But I take
your point.
An index
should be a creative work in its own right, embodying appropriate
synonyms and a conceptual structure for the topic at hand.

I think it would be easier to edit a list into appropriate shape.
FWIW, it's trivial to search and replace in a .rtf to tag terms for
Word's (or OO's) indexing feature though

Trivial = slow and menial. Which is why I was looking for a better solution.

But I appreciate the reply, in any event.
 
William F. Adams

I'd said:
and Mike Bourke replied:
I think it would be easier to edit a list into appropriate shape.

Such editing doesn't get one the wanted synonyms, nor does it easily
lead to a suitable conceptual structure --- indexing is hard, but it's
also a creative endeavour in its own right.
Trivial = slow and menial. Which is why I was looking for a better solution.

Trivial == easily scriptable / programmable

Here's how to do it:

- dump your .rtf to .txt
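- split the text into one word per line, sort it, and run the list through
a utility like Unix's uniq to get a list of discrete words (uniq only
collapses adjacent duplicates, so the sort has to come first)
- (optional) filter out terms which it doesn't make sense to include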
- batch edit the list into a sed or other script which replaces each
occurrence of the word with the word, the indexing tag, a duplicate of
the word, and the closing indexing tag
- apply the script to the source .rtf
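
The steps above could be sketched in Python roughly as follows. This is a minimal sketch, not William's actual script: unique_words and tag_words are hypothetical names, and the default {\xe\v ...} tag is an assumption about RTF index-entry markup that should be checked against what your word processor actually writes. It also tags matches anywhere in the file, including the RTF header (font names and so on), so treat the output as a starting point.

```python
import re

# A "word" here is a run of letters, optionally with apostrophes or hyphens.
WORD_RE = re.compile(r"[A-Za-z][A-Za-z'-]*")


def unique_words(plain_text, stop_words=frozenset()):
    """Return the sorted, de-duplicated, lower-cased words in plain_text."""
    words = {w.lower() for w in WORD_RE.findall(plain_text)}
    return sorted(words - set(stop_words))


def tag_words(rtf_source, words, open_tag=r"{\xe\v ", close_tag="}"):
    """Follow each whole-word occurrence with open_tag + word + close_tag.

    open_tag and close_tag are ASSUMPTIONS; substitute whatever markup
    your word processor expects for an index entry.
    """
    if not words:
        return rtf_source
    # Longest words first, so the alternation prefers the longest match.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # The lookbehind rejects a preceding backslash or letter, which keeps
    # the pattern from matching inside RTF control words such as \par.
    pattern = re.compile(
        r"(?<![\\A-Za-z])(" + alternation + r")(?![A-Za-z])",
        re.IGNORECASE,
    )
    # A function replacement avoids backslash-escaping issues in the tags.
    return pattern.sub(
        lambda m: m.group(1) + open_tag + m.group(1) + close_tag,
        rtf_source,
    )
```

In use, unique_words would be run on the .txt dump (passing stop words such as "and", "but" and "the" to drop them), the resulting list edited by hand, and tag_words then applied to the original .rtf source.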

Trivial.

William
 
Al Klein

Trivial = slow and menial. Which is why I was looking for a better solution.

There used to be a word processor - ca 1988 or 1989 - that did a
pretty good job of building an index from a document. I've been trying
to remember the name all day, but I can't. It wasn't one of the
well-known ones, though. (But it ran under DOS or Win 386.)
 
David

There used to be a word processor - ca 1988 or 1989 - that did a
pretty good job of building an index from a document. I've been trying
to remember the name all day, but I can't. It wasn't one of the
well-known ones, though. (But it ran under DOS or Win 386.)

Are you thinking of ZYIndex (or something close)?
 
Al Klein

Are you thinking of ZYIndex (or something close)?

No, I seem to recall that it started with a "W". Came on a whole
bunch of 5-1/4" floppies. Boxed. (But those last 2 apply to
everything of that era, I guess.)

We used it to create a book - Index, TofC, Chapter breaks, etc. If
anyone here worked at Citibank GSO in those days, please speak up. :)
Steve? Steph? Ed? John?
 
me

No, I seem to recall that it started with a "W". Came on a
whole bunch of 5-1/4" floppies. Boxed. (But those last 2
apply to everything of that era, I guess.)

We used it to create a book - Index, TofC, Chapter breaks,
etc. If anyone here worked at Citibank GSO in those days,
please speak up. :) Steve? Steph? Ed? John?

WordPerfect?

J
 
B. R. 'BeAr' Ederson

No, I seem to recall that it started with a "W". Came on a whole
bunch of 5-1/4" floppies. Boxed. (But those last 2 apply to
everything of that era, I guess.)

We used it to create a book - Index, TofC, Chapter breaks, etc.

<OT>WordStar combined with StarIndex did a good job in those days.</OT>

BeAr
 
thoss

No, I seem to recall that it started with a "W". Came on a whole
bunch of 5-1/4" floppies. Boxed. (But those last 2 apply to
everything of that era, I guess.)

You're probably thinking of WordStar. But I don't agree that it wasn't
one of the well-known ones.

The problem here would be with rtf. You can convert from rtf to
WordStar format and generate the index, but there would be no guarantee
that the conversion would keep page breaks in the same place unless you
edited the file by replacing soft with hard page breaks.
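
Once the page breaks are hard, a word-to-page listing of the kind Mike asked for could be built from a plain-text export. A minimal sketch, assuming the export marks each hard page break with a form feed character; the function names are hypothetical, and the code knows nothing about fonts or page dimensions, so the page numbers are only as good as the page breaks in the export.

```python
import re
from collections import defaultdict


def concordance(paged_text, page_break="\f"):
    """Map each lower-cased word to the sorted list of pages it appears on.

    ASSUMPTION: pages in paged_text are separated by page_break
    (a form feed by default), and pages are numbered from 1.
    """
    pages_by_word = defaultdict(set)
    for page_no, page in enumerate(paged_text.split(page_break), start=1):
        for word in re.findall(r"[A-Za-z][A-Za-z'-]*", page):
            pages_by_word[word.lower()].add(page_no)
    # Sort the words alphabetically and each word's pages numerically.
    return {word: sorted(pages) for word, pages in sorted(pages_by_word.items())}


def format_concordance(entries):
    """Render one 'word: 1, 2, 5' line per entry, ready for hand editing."""
    return "\n".join(
        f"{word}: {', '.join(str(p) for p in pages)}"
        for word, pages in entries.items()
    )
```

The plain-text result can then be trimmed by deleting the lines for words that don't belong in the index.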
 
