Fastest methods for text searching.

D

David G

i'm in the process of implementing a database to replace an existing
spreadsheet method being used. I've normalized the tables and searches
based on the new tqable design is fast. There are a few many to
manygrelationships which I must preprocess to show the data in a form
that is familiar to users.

This is a typical instances.:
Multiple authors on any number of papers being researched and written.
(M-M).authors - books
I have a join table that connects the AuthorID with the BookID. Since
I;m displayin the records in continuous form view, I cant insert a
subforn listing records returned by a qryAuthorsForBook.
So, I have a cotrol on the main form with record control
=fBuildPhrase("Preparer")

fBuildPhrase handles all of the many to many joins by concotenating
The name fields end to end creatng a single text string containing all
the names associated by the many to many table

This part is relatively fast, nut is there a better way?

Next, I'mleft to deal with existing data that is not normalized and
has little fixed structure. We need to allow users to type in words to
be searched for in the titles, lines, and department fields. (I
managed to reword the department information into a form that could be
indexed and normalized,

My approach has been to create a global array variable which parses
the contents of the search words text box, then stores eaxh word in
one it's elemnets.

I have a second function that compares the keyword array against the
concotenated text in 'Title, and Department, and Line" fields. The
function returns 0,-1

Next I create a query involving bookID and
ISMATCH:=fKeywordFoundInPhrase(bookID) where ISMATCH = -1
which returns only BOOKID's with keywords in their titles. I use the
InString function to find any matches.

Theres aprox 2600 records in the main table. With no other criterial
selected except a single keyword, say "Shoe" the seach takes around 30
to 40 seconds, Adding a second or third word only increases the
duration by 8 to 10 seconds,

I've noticed that if
i have other criteria set (say for Author"), which is based on indexed
fields, the keyword searches are much shorter. (The recordsource for
the search form is an SQL statement that i build from the criteria
selections made on the form. Im thinking that the order of the items
in the SQL statment may affect the search duration. )

here's a typical SQL:
SELECT DISTINCT tblOPL.tblOPLID, tblOPL.tblOPLNumber,
tblOPL.tblOPLDate, tblOPL.tblOPLTitle,
tblDepartment.tblDepartmentName, tblOPL.tblDepartmentID,
tblOPL.tblOPLLine, tblOPL.tblOPLInactive FROM tblDepartment RIGHT JOIN
tblOPL ON tblDepartment.tblDepartmentID = tblOPL.tblDepartmentID WHERE
(((tblOPL.tblOPLID) In (SELECT jtblOPLPreparer.tblOPLID FROM
jtblOPLPreparer Where (((jtblOPLPreparer.tblPreparerID)=180)))) AND
((tblOPL.tblOPLID) In (SELECT qryKeywordInTitle.tblOPLID FROM
qryKeywordInTitle WHERE (((qryKeywordInTitle.MatchFound)=-1)))) AND
((tblOPL.tblOPLDate)>=#7/3/2002# And (tblOPL.tblOPLDate)<=#7/21/2010#)
AND ((tblOPL.tblDepartmentID)=3) AND ((tblOPL.tblOPLInactive)=0));

Would the relative location of a fields existance in the SQL affect
how long it takes a query to executr?

Are there better approaches to searching trough fields for matching
text?

Any thoughts or comments or criticizms would be appreciated.
Any comments are greatly appreciated!
 
A

a a r o n . k e m p f

the fastest way to search text is to use 'full text search' with sql
server... it is included with every edition of sql server except
compact edition
 
A

a a r o n . k e m p f

or, you could just use the built in FULL TEXT SEARCH and 'CONTAINS'
clause with SQL Server
 
D

David G.

Allen and Aaron:
Thanks for taking the time to comment.

I will be sticking with Access at this point. My plan moving forward
is to match the book titles with key words from a keyword list, in a
M-M configuration. Books entered in the future will searchable by
indexed keywords in a join table consisting of only 2 foreign keys. If
searching the records already in place becomes a big issue, they can
go back and assign key words, or I could work up a run-once function
to assign keywords based on text found in the title.

Thanks again!

or, you could just use the built in FULL TEXT SEARCH and 'CONTAINS'
clause with SQL Server
THANKS!
David G.
 
J

John W. Vinson

Allen and Aaron:
Thanks for taking the time to comment.

I will be sticking with Access at this point. My plan moving forward
is to match the book titles with key words from a keyword list, in a
M-M configuration. Books entered in the future will searchable by
indexed keywords in a join table consisting of only 2 foreign keys. If
searching the records already in place becomes a big issue, they can
go back and assign key words, or I could work up a run-once function
to assign keywords based on text found in the title.

I'd suggest that you save a link by making the primary key of the keywords
table the keyword itself (and using the keyword as a foreign key in the
junction table as well). An index on a short, unique, stable ASCII text string
(the keyword) will be compact and efficient; there's no benefit to indexing a
numeric key and bringing another table into the query!

By all means automate the assignment of keywords. This is a well-worn issue (I
worked tangentially with such a database, in Oracle) thirty years ago; I'm
sure you'll be able to find software that parses titles, discards non-key
words and punctuation, etc.

That said... this is one case where Aaron's idee fixe of SQL/Server may be
apropos; full text searching tools in SQL are very good indeed and if
performance is an issue, it's certainly worth examining. SQL used to be very
expensive and complex to manage, but MS is now making it much more accessible.
--

John W. Vinson [MVP]
Microsoft's replacements for these newsgroups:
http://social.msdn.microsoft.com/Forums/en-US/accessdev/
http://social.answers.microsoft.com/Forums/en-US/addbuz/
and see also http://www.utteraccess.com
 
A

a a r o n . k e m p f

David;

Have fun reinventing the wheel and wasting your time and everyone
elses.
SQL Server has this built in.

-Aaron
 
A

a a r o n . k e m p f

John;

SQL Server has been free for what, 12 years now? Since 1998?
I don't know where you come up with that BS.

Thanks

-Aaron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top