Honestly, I'm with Chip on this - why use something not suited for
making an index when Word has built-in features to create one, that
will update dynamically if the book you're working with changes. But,
just for kicks, here's the process.
Create a new Excel worksheet. Label the first column "First Letter",
second column "Word", and the next couple of columns with this
formula:
= "Page number "&COLUMN()-2
Go through the book, looking at each word. If it's the first time you
encounter the word, type it in the second column of your Excel
worksheet, marking the page number in the second column. The second
time you encounter this word, find this line, and use the third column
to mark the page number of its second occurence. Next, sort the list
alphabetically. In the first column, use this formula, starting from
the second row:
=LEFT(A2)
This will give you the first letter of each word. Now, create a
PivotTable. Use these columns for Row values, in this order:
First Letter, Word, Page #1, Page #2, Page #3 etc however many
occurences you want to spot.
Finally, save the resulting Pivot sheet as a .CSV file. Open it in
notepad and perform the following Replace operation until you find 0
occurences:
Find What: ",,"
Replace with What: ","
There's your index. You might try using a VBA procedure for this
process, so as to avoid too much manual work. I'm not familiar with
Word's VBA, and I don't have time to figure out how to find the page
number of a particular Range object. But, here's some code that will
count the number of occurences of each word 3 characters or longer.
Put this in the Workbook module of a worksheet that has a Sheet1.
Make a copy of your book and save it as "c:\book.doc" (or change the
filename in the wrd.Open() line). Be sure to include a reference to
the Word object model, in the Excel VBA project.
Public Sub makeIndex()
Dim wrd As Word.Application
Dim wrDoc As Word.Document
Dim xlRange As Excel.Range
Dim xlRangeFound As Excel.Range
Dim lineCounter As Long
Dim columnCounter As Long
On Error Resume Next
Set wrd = New Word.Application
Application.ScreenUpdating = False
wrd.ScreenUpdating = False
wrd.Visible = False
Set wrDoc = wrd.Documents.Open("c:\book.doc")
Dim myWord As Word.Range
lineCounter = 2
Sheet1.UsedRange.Clear
Set xlRange = Sheet1.Columns(2)
For Each myWord In wrDoc.Words
If Len(Trim(myWord.Text)) >= 3 Then
Set xlRangeFound = xlRange.Find(myWord.Text, , xlValues)
If (xlRangeFound Is Nothing) Then
Sheet1.Cells(lineCounter, 2).Value = myWord.Text
Sheet1.Cells(lineCounter, 3).Value = 1
lineCounter = lineCounter + 1
Else
Sheet1.Cells(xlRangeFound.Row, 3).Value =
Sheet1.Cells(xlRangeFound.Row, 3).Value + 1
End If
End If
DoEvents
Next myWord
wrd.Quit (False)
End Sub
After the run finishes - about 3 minutes for a 16,000 word document -
you'll have to add in the column headers.