Libraries for Index, Search and Retrieval

  • Thread starter: Ahmed Qurashi

Ahmed Qurashi

Does anyone know of any .NET libraries for the indexing, search, and
retrieval of text documents? I am looking for APIs that analyze a set of
docs, convert the text into indexable tokens, create indices, and then
harvest query results from those indices.
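To make the pipeline concrete, here is a minimal sketch of those steps (tokenize, build an index, query it) in plain Python. It is only an illustration of the general inverted-index idea, not how Lucene or any .NET library does it; the names `tokenize`, `build_index`, and `search` and the sample documents are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Lucene is a text search engine library",
    2: "Object persistence with SQL or XML",
    3: "A search engine for object persistence",
}
index = build_index(docs)
hits = search(index, "search engine")
```

A real engine adds a lot on top of this (stemming, stop words, ranking, on-disk index formats), but the token-to-documents map is the core structure.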

Something similar to Apache's Jakarta Lucene project, a high-performance,
full-featured text search engine library:

http://jakarta.apache.org/lucene/docs/api/index.html

I am interested in experimenting with applications of search engine
technology to build a robust, efficient engine for object persistence.
Typically, object persistence is handled via SQL or XML backends, which
for large datasets can involve long wait times for complex queries.

Such a solution already exists in the Java world: MAOS
(Meta-Attribute Object Store) is a lightweight Java library/framework
implementing simple object persistence using search-engine technology.

http://sourceforge.net/projects/maos/
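As a rough illustration of what object persistence on top of an index could look like, here is a toy in-memory sketch in Python. It is not MAOS's actual design or API (I have not reproduced anything from that project); the `AttributeStore` class, its methods, and the sample objects are all hypothetical, and it assumes attribute values are hashable.

```python
from collections import defaultdict


class AttributeStore:
    """Toy in-memory object store that indexes objects by attribute value."""

    def __init__(self):
        self.objects = {}              # object id -> attribute dict
        self.index = defaultdict(set)  # (attribute, value) -> object ids

    def save(self, obj_id, attrs):
        # Drop stale index entries if the object is being overwritten.
        for key, value in self.objects.get(obj_id, {}).items():
            self.index[(key, value)].discard(obj_id)
        self.objects[obj_id] = dict(attrs)
        for key, value in attrs.items():
            self.index[(key, value)].add(obj_id)

    def find(self, **criteria):
        """Return the objects matching all attribute=value criteria."""
        ids = None
        for pair in criteria.items():
            matches = self.index.get(pair, set())
            ids = set(matches) if ids is None else ids & matches
        return [self.objects[i] for i in sorted(ids or [])]


# Hypothetical sample objects.
store = AttributeStore()
store.save(1, {"type": "book", "lang": "en"})
store.save(2, {"type": "book", "lang": "de"})
store.save(3, {"type": "cd", "lang": "en"})
german_books = store.find(type="book", lang="de")
```

The point of the sketch is that a query becomes a set intersection over index entries rather than a scan of every stored object, which is where the speedup for complex queries would come from.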

If you know of something similar in the C# vein or would be interested in
starting a project to create a native .NET solution for index, search and
retrieval, please respond here.

ok,
aq
 
Yes, thanks for the reply, I also found this right after I posted!

It's exactly what I am looking for. The entire Lucene library is ported to C#
and exposed via a single assembly: Lucene.Net.dll
It just doesn't get any easier than that...

Incidentally, while I was doing some research I found several commercial
products for indexing XML and other SGML data. These products can run to
$3500 per license. SearchBlackBox, the product originating from the original
port of Lucene to .NET, sells for about $500. So, there is definitely some
viability for commercial products using search engine tech, including an
object persistence API and product.

If anyone is interested in starting a project like this, drop me a line...

ok,
aq
 
Aq, I am shocked you were able to find something as low as $3500. We have
used Verity and paid about $350,000 as a licensing fee, and others can range
anywhere from that to as high as a million dollars for 2 years for
Google. (We need heavy-duty customization.) My hands are too tied up right
now with a new book I am writing, or I'd have loved to join a project with
you.

- Sahil Malik
http://dotnetjunkies.com/weblog/sahilmalik
 