lucene-dev mailing list archives

From Halácsy Péter <>
Subject lucene & avalon (was: Proposal for Lucene / new component)
Date Tue, 02 Apr 2002 19:55:29 GMT
More than a month ago I promised to write an Avalon example application.
Now my project needs some Avalon components, so I "avalonized" Lucene. I
published the package as a zip file:

The main idea is to make two manager components: one for Searchers and one for
Writers. This is something similar to DataSource/DriverManager and
Interfaces of the two main components:
public interface SearcherManager extends Component {
    public Searcher getSearcher();
}

public interface IndexWriterManager extends Component {
    public IndexWriter getWriter(boolean create);
}

You can configure:
1. exactly which implementing Manager class to use (I implemented two
SearcherManagers: IndexSearcher and MultiSearcherManager and one
2. in my implementation, which directory to use and, for the writer,
mergeFactor, maxDocs ...

I rewrote two demo files, SearchFiles and IndexFiles, to use my components.
You can compile and try them.

In my project I have two indexes in two different (filesystem) directories. I
have three Searchers:
1. one for directory I
2. one for directory II
3. a MultiSearcher

To configure this I have to write a config file:
<components logger="core">
   <!-- standard analyzer of lucene -->
        <standard name="standard">  <stopwords>
    <directory-searcher name="topics">
     <directory-searcher name="messages">
    <multi-searcher name="multi">

    <directory-writer name="topics">
    <directory-writer name="messages">



Why this is good for me:
1. I can hide the implementation details from the application
2. I can configure the system via config files
3. my logging system is ready to use (provided by Apache LogKit)
4. I can change a component's implementation without modifying the code
(I'll change the analyzer because the standard Lucene analyzer can't work
with ISO-8859-2 characters [I'll check it tomorrow])

I have to work on a better SearcherManager. We know that several threads can
reuse the same IndexReader, but it should be closed and reopened when the
directory is modified. My problem is:
Thread-1 gets a searcher and Thread-2 gets another searcher; the two
Searchers use the same IndexReader. Thread-1 finishes its work and
closes its Searcher. That Searcher will close the IndexReader that is still
used by Thread-2.
I think I have to implement something similar to an (SQL) connection cache.

Thread-1 uses a Searcher that uses an instance of CachedIndexReader. If
Thread-1 closes the CachedIndexReader, it doesn't close the physical
IndexReader; it only notifies the cache that its close method was called.
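
A minimal sketch of the connection-cache idea described above, using reference counting. PhysicalReader, ReaderCache, CachedReader and the retire() method are hypothetical names introduced only for illustration; PhysicalReader stands in for the real IndexReader, and none of this is Lucene or Avalon API. The assumption is that the physical reader may be closed only after the cache has been told the reader is stale and the last handle has been closed.

```java
// Hypothetical stand-in for a Lucene IndexReader; not the real class.
interface PhysicalReader {
    void close();
}

// Hands out lightweight handles that all share one physical reader.
final class ReaderCache {
    private final PhysicalReader physical;
    private int openHandles = 0;
    private boolean retired = false;
    private boolean physicallyClosed = false;

    ReaderCache(PhysicalReader physical) {
        this.physical = physical;
    }

    // Called by each thread that needs a reader.
    synchronized CachedReader acquire() {
        if (retired) {
            throw new IllegalStateException("reader has been retired");
        }
        openHandles++;
        return new CachedReader(this);
    }

    // Called by a handle when its close() method is invoked.
    synchronized void release() {
        openHandles--;
        closeIfUnused();
    }

    // Called when the directory was modified and this reader is stale.
    synchronized void retire() {
        retired = true;
        closeIfUnused();
    }

    // The physical close happens only when nobody uses the reader any more.
    private void closeIfUnused() {
        if (retired && openHandles == 0 && !physicallyClosed) {
            physicallyClosed = true;
            physical.close();
        }
    }

    synchronized boolean isPhysicallyClosed() {
        return physicallyClosed;
    }
}

// The handle a Searcher would hold instead of the physical reader.
final class CachedReader {
    private final ReaderCache cache;
    private boolean closed = false;

    CachedReader(ReaderCache cache) {
        this.cache = cache;
    }

    // Closing a handle only notifies the cache; repeated calls are ignored.
    synchronized void close() {
        if (!closed) {
            closed = true;
            cache.release();
        }
    }
}
```

With this shape, Thread-1 closing its handle leaves the reader used by Thread-2 untouched. A manager that notices a modified directory could call retire() on the old cache and create a new one around a freshly opened reader.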

Notice that we don't need to change the SearcherManager interface, so I can
plug in a new implementation (to be honest, this kind of Manager class could
be used without Avalon: this is simply a use of the abstract factory design
pattern).
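
To illustrate the point that the manager approach does not depend on Avalon, here is a small plain-Java sketch of the same abstract-factory arrangement. All type names (DemoSearcher, DemoSearcherManager, DirectoryManager, MultiManager, ManagerRegistry) are hypothetical stand-ins, not Lucene or Avalon classes; the registry plays the role that the component container and config file play in the Avalon version.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: these are not Lucene or Avalon types.
interface DemoSearcher {
    String describe();
}

interface DemoSearcherManager {
    DemoSearcher getSearcher();
}

// Manager for a searcher over a single index directory.
final class DirectoryManager implements DemoSearcherManager {
    private final String directory;

    DirectoryManager(String directory) {
        this.directory = directory;
    }

    public DemoSearcher getSearcher() {
        return () -> "dir:" + directory;
    }
}

// Manager that combines the searchers of several other managers.
final class MultiManager implements DemoSearcherManager {
    private final List<DemoSearcherManager> parts;

    MultiManager(List<DemoSearcherManager> parts) {
        this.parts = new ArrayList<>(parts);
    }

    public DemoSearcher getSearcher() {
        List<String> names = new ArrayList<>();
        for (DemoSearcherManager part : parts) {
            names.add(part.getSearcher().describe());
        }
        return () -> "multi" + names;
    }
}

// Name-to-manager lookup; application code sees only the interface.
final class ManagerRegistry {
    private final Map<String, DemoSearcherManager> byName = new HashMap<>();

    void register(String name, DemoSearcherManager manager) {
        byName.put(name, manager);
    }

    DemoSearcherManager lookup(String name) {
        DemoSearcherManager manager = byName.get(name);
        if (manager == null) {
            throw new IllegalArgumentException("no manager named " + name);
        }
        return manager;
    }
}
```

Application code would ask the registry for "topics", "messages" or "multi" and call getSearcher() without knowing which implementation is behind the name.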

Something else:
how about an IndexWriter called BatchIndexWriter that uses a RAMDirectory
to buffer the documents to be added to the index:
// sketch
public void addDocument(Document d) {
    if (count > aLimit) {
        // start a fresh RAM buffer once the limit is exceeded
        ramWriter = new IndexWriter(new RAMDirectory());
        count = 0;
    }
}

Of course, the value of the limit could be configured.
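
A self-contained sketch of the buffering idea, written without Lucene so it can run on its own. DocumentSink and BatchingWriter are hypothetical names: the sink stands in for the on-disk index, an in-memory list stands in for the RAMDirectory, and documents are plain strings. It assumes the buffer is flushed as soon as it reaches the limit, which differs slightly from the `count > aLimit` test in the sketch above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the on-disk index; not a Lucene class.
interface DocumentSink {
    void addAll(List<String> docs);
}

// Buffers documents in memory and hands them to the sink in batches.
final class BatchingWriter {
    private final DocumentSink target;
    private final int limit;
    private List<String> buffer = new ArrayList<>();

    BatchingWriter(DocumentSink target, int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        this.target = target;
        this.limit = limit;
    }

    // Adds to the in-memory buffer; flushes once the limit is reached.
    void addDocument(String doc) {
        buffer.add(doc);
        if (buffer.size() >= limit) {
            flush();
        }
    }

    // Pushes the buffered batch to the sink and starts a fresh buffer.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        target.addAll(buffer);
        buffer = new ArrayList<>();
    }

    // Makes sure a partially filled buffer is not lost.
    void close() {
        flush();
    }
}
```

The limit is a constructor argument here, so it could be read from the same config file as the other writer settings.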


ps: good tutorial:


RE: Proposal for Lucene / new component
From: Andrew C. Oliver
Subject: RE: Proposal for Lucene / new component
Date: Sun, 03 Mar 2002 11:48:27 -0800

> I think if you need logging, configuring, threading, pooling (for the
> crawler) and want to be component based you need a framework some thing
> like avalon. It took one day to understand Avalon and write the first
> Hello world application but you can save a lot of time while coding.

Great!  Can you post your work to get the Hello Avalon App somewhere?
If you could document along those lines as well then I'll be happy to go
and write a "getting started" guide for Avalon.

I'm not objecting to using Avalon provided I can actually understand
it.  I'm really close thanks to the fine work of Ken Barrozzi, but
I'm one step away from actually being able to start using Avalon.  It's
not an "I won't" issue, it's an "I can't" issue.
