[s-cars] search engine installed for testing

Brett Dikeman brett.dikeman at gmail.com
Sat Mar 21 11:05:05 PDT 2009


I've installed a search engine; the form is on the main page, below
the Google search bar.  It crawled the Knowledgebase, the old
archives, and the current pipermail archives last night and early
this morning.  Give it a go, see what you come up with, and compare
it to Google on coverage, results quality, etc.  I'm all ears.  This
is supposed to be the best the open-source community has to offer.

Similar to Google, it supports phrase matching via quotes and
excluding results with a minus sign.  For full usage tips, make sure
to check out the help link on the search results page (I'll add it to
the main page soon).

It only crawled 134,000 files; there are more than that in just the
1991-1999 archives.  Even so, it seems to return more comprehensive
results.  "Alternator" turns up twice as many hits as Google, and
you're far more likely to find current content.  Google is very
hit-or-miss in this regard.

With the current version, re-crawling is difficult (long story), so
the index might be up to roughly a week out of date, but the Nutch
project is gearing up for a new release which supports a backend
database that makes re-crawls easier.  If anyone is familiar with the
Lucene/Nutch/Solr set of tools and can help with tweaking the
installation or answering questions, that would be appreciated; the
documentation on Nutch leaves much to be desired.

I'm also open to suggestions WRT free, open-source alternatives that
comprise a complete package, i.e. crawler/recrawler, search index,
database, and web front-end.

Brett

