Extracting keywords from a string

Raines95

This question was posted a few years ago, and I am looking for essentially
the same result: I want to pull keywords out of a string and use their
frequency and relevance to identify problem trends. The VBA code Ken Sheridan
provided at the time was this:
''''Module starts''''
Option Compare Database
Option Explicit

Function GetLongWords(ByVal strText As String, intWordLength As Integer) As String

    Dim intSpacePos As Integer
    Dim strWord As String, strWordList As String

    intSpacePos = 0

    ' replace any double spaces with single space
    strText = Replace(strText, "  ", " ")

    ' loop through text and identify each word,
    ' assuming a word is terminated by a space or end of string
    Do While True
        intSpacePos = InStr(strText, " ")
        If intSpacePos > 0 Then
            strWord = Left$(strText, intSpacePos - 1)
            ' remove any punctuation from end of word
            strWord = TrimWord(strWord)
            If Len(strWord) >= intWordLength Then
                ' following <If> and <End If> only necessary if you want to
                ' look words up in Keywords table
                If Not IsNull(strWord, "Keywords" Then
                    strWordList = strWordList & ", " & strWord
                End If
            End If
            ' trim word off text
            strText = Mid$(strText, intSpacePos + 1)
        Else
            ' word must be last in text so
            strWord = TrimWord(strText)
            If Len(strWord) >= intWordLength Then
                ' following <If> and <End If> only necessary if you want to
                ' look words up in Keywords table
                If Not IsNull(strWord, "Keywords" Then
                    strWordList = strWordList & ", " & strWord
                End If
            End If
            Exit Do
        End If
    Loop
    ' remove leading comma and space
    GetLongWords = Mid$(strWordList, 3)

End Function


Private Function TrimWord(strWord As String) As String

    ' remove any punctuation characters from word
    Do While True
        Select Case Right$(strWord, 1)
            Case ".", ",", ";", ":", "?", "!"
                strWord = Left$(strWord, Len(strWord) - 1)
            Case Else
                Exit Do
        End Select
    Loop

    TrimWord = strWord

End Function
''''Module ends''''

I am using Access 2007 and when I first used this code, it seemed like it
was stuck in the loop as Access would just sit there forever. Now when I run
it, it returns no values. I have confirmed that there are words meeting the
length I am looking for. Is there something new in Access 2007 that would
require changes to this code?
 
Duane Hookom

I would remove the $ from all string functions, for example changing:
Right$(strWord, 1)
to
Right(strWord, 1)
This line is also wonky:
If Not IsNull(strWord, "Keywords" Then
I'm not sure what belongs there but it is clearly missing a ).
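
The thread never shows what the broken line originally looked like. One plausible reading, given the comment about looking words up in a Keywords table, is a DLookup call wrapped in IsNull. The sketch below is a guess along those lines, not the original code: the helper name IsKeyword and the field name Keyword are assumptions (the thread names only the table).

```vba
Option Explicit

' Hypothetical helper: returns True if strWord is found in a table named Keywords.
' The field name "Keyword" is an assumption; the thread names only the table.
Function IsKeyword(ByVal strWord As String) As Boolean
    Dim strCriteria As String

    ' double any embedded double quotes so the criteria string stays valid
    strCriteria = "Keyword = """ & Replace(strWord, """", """""") & """"

    ' DLookup returns Null when no row matches the criteria
    IsKeyword = Not IsNull(DLookup("Keyword", "Keywords", strCriteria))
End Function
```

With a helper like this, the test in the loop would read If IsKeyword(strWord) Then.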
 
Raines95

I removed all the "$" from the code and it is still returning no values. The
code involving "Keywords" was removed at the beginning as I was not using a
Keywords table.
 
Duane Hookom

I'm not sure what your current code is, or what you are attempting to do if
you don't have a Keywords table. Does the code compile?
 
Raines95

Here is the code I am using. It does compile, and I am also using the
DAO 3.6 Object Library:

Option Compare Database
Option Explicit

Function GetLongWords(ByVal strText As String, intWordLength As Integer) As String

    Dim intSpacePos As Integer
    Dim strWord As String, strWordList As String

    intSpacePos = 0

    ' replace any double spaces with single space
    strText = Replace(strText, "  ", " ")

    ' loop through text and identify each word,
    ' assuming a word is terminated by a space or end of string
    Do While True
        intSpacePos = InStr(strText, " ")
        If intSpacePos > 0 Then
            strWord = Left(strText, intSpacePos - 1)
            ' remove any punctuation from end of word
            strWord = TrimWord(strWord)
            If Len(strWord) >= intWordLength Then
            End If
            ' trim word off text
            strText = Mid(strText, intSpacePos + 1)
        Else
            ' word must be last in text so
            strWord = TrimWord(strText)
            If Len(strWord) >= intWordLength Then
            End If
            Exit Do
        End If
    Loop
    ' remove leading comma and space
    GetLongWords = Mid(strWordList, 3)

End Function

Private Function TrimWord(strWord As String) As String

    ' remove any punctuation characters from word
    Do While True
        Select Case Right(strWord, 1)
            Case ".", ",", ";", ":", "?", "!"
                strWord = Left(strWord, Len(strWord) - 1)
            Case Else
                Exit Do
        End Select
    Loop

    TrimWord = strWord

End Function
 
Duane Hookom

I still don't understand your objective. The function returns the value of
strWordList but you aren't doing anything in your code to build this string.
It begins as "" and ends as "".
 
Raines95

What I'm trying to do is search through a field (typically a memo field) in a
table and have it output all words of at least 5 characters in that field
(initially 5 characters, but that will probably change with time). For
example, given "The system reported an unhandled exception at point....", it
would return the following words: system, reported, unhandled, exception,
point. I will then use that data for further analysis and try to discover
possible trends. At some point I do foresee building a keyword table, but at
this initial stage I am dealing with a large amount of information and am
trying to find an easier way to break it down.
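
The "further analysis" step is not spelled out in the thread. As a rough sketch only, assuming the words arrive as a comma-and-space separated string (the format GetLongWords is meant to return), a frequency tally could look like the following. CountWords is a hypothetical name, and the late-bound Scripting.Dictionary is one possible container, not something the thread prescribes.

```vba
Option Explicit

' Hypothetical helper: tally how often each word occurs in a ", "-separated list.
' Returns a late-bound Scripting.Dictionary: key = lower-cased word, item = count.
Function CountWords(ByVal strWordList As String) As Object
    Dim dict As Object
    Dim varWord As Variant
    Dim strKey As String

    Set dict = CreateObject("Scripting.Dictionary")

    If Len(strWordList) > 0 Then
        For Each varWord In Split(strWordList, ", ")
            ' fold case so "System" and "system" count as the same word
            strKey = LCase$(Trim$(varWord))
            If Len(strKey) > 0 Then
                ' reading a missing key creates it as Empty, and Empty + 1 = 1
                dict(strKey) = dict(strKey) + 1
            End If
        Next varWord
    End If

    Set CountWords = dict
End Function
```

Calling CountWords on the concatenated output for many records would give a per-word count that can then be sorted or written to a table.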
 
Duane Hookom

As I stated previously, you aren't updating strWordList. Modify the code to
insert the new line marked below:

End If
strWordList = strWordList & ", " & strWord 'new line
Loop
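
Placed exactly there, the new line appends every word regardless of length, and the last word is never appended because Exit Do leaves the loop first. A minimal sketch of one way to tweak it, which puts the append inside each length check, is shown below. It is not the poster's final code, which the thread does not show. Two further changes are my own assumptions: the position variable is a Long so that text longer than 32,767 characters does not overflow, and repeated spaces are collapsed in a loop.

```vba
Option Explicit

Function GetLongWords(ByVal strText As String, intWordLength As Integer) As String

    Dim lngSpacePos As Long   ' Long (assumption): avoids overflow on long memo text
    Dim strWord As String, strWordList As String

    ' collapse any run of spaces down to a single space
    Do While InStr(strText, "  ") > 0
        strText = Replace(strText, "  ", " ")
    Loop

    Do While True
        lngSpacePos = InStr(strText, " ")
        If lngSpacePos > 0 Then
            ' take the next word and strip trailing punctuation
            strWord = TrimWord(Left$(strText, lngSpacePos - 1))
            If Len(strWord) >= intWordLength Then
                strWordList = strWordList & ", " & strWord
            End If
            ' trim word off text
            strText = Mid$(strText, lngSpacePos + 1)
        Else
            ' no space left, so the remainder is the last word
            strWord = TrimWord(strText)
            If Len(strWord) >= intWordLength Then
                strWordList = strWordList & ", " & strWord
            End If
            Exit Do
        End If
    Loop

    ' remove leading comma and space
    GetLongWords = Mid$(strWordList, 3)

End Function

Private Function TrimWord(ByVal strWord As String) As String

    ' remove trailing punctuation characters from word
    Do While Len(strWord) > 0
        Select Case Right$(strWord, 1)
            Case ".", ",", ";", ":", "?", "!"
                strWord = Left$(strWord, Len(strWord) - 1)
            Case Else
                Exit Do
        End Select
    Loop

    TrimWord = strWord

End Function
```

With this version, GetLongWords("The system reported an unhandled exception at point....", 5) returns "system, reported, unhandled, exception, point".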
 
Raines95

I had to do a little tweaking, but it's working properly now. Thank you very
much for your help.
 
