lucene-dev mailing list archives

From David Worms <>
Subject Re: [LARM] next steps
Date Sat, 01 Feb 2003 00:52:50 GMT

On Friday, January 31, 2003, at 03:48  PM, Clemens Marschner wrote:
> Great, so how should we go on?
> I suggest we wait for you, David, so that you can make the code a little
> more stable and change the things you mentioned. You said something about
> two weeks (?)

Two weeks is the time I need to become more familiar with the crawler, 
set up some configuration, try Merlin, and take a deeper look at the 
Excalibur event package. At that point, I could send similar but cleaner code.

> I would say we should then be at a point where we could get rid of
> de.lanlab.* packages and move the rest to something like org.apache.larm and
> then put it into the sandbox.

or maybe

> We should also check if performance is a problem, especially with those
> factory methods.

We could easily avoid the message factories and use regular constructors.
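
As a rough sketch of the difference (all class names here are hypothetical and not taken from the LARM code), a message built through a factory and one built with a plain constructor end up identical; the factory only adds a level of indirection. Whether that indirection costs anything measurable would still need to be profiled.

```java
/** Hypothetical message type; LARM's real message classes may look different. */
class UrlMessageSketch {
    private final String url;
    private final int depth;

    // Plain constructor: the caller creates the message directly.
    UrlMessageSketch(String url, int depth) {
        this.url = url;
        this.depth = depth;
    }

    String getUrl() { return url; }
    int getDepth() { return depth; }
}

/** Hypothetical factory, shown only for comparison with the constructor. */
class UrlMessageFactorySketch {
    // The factory method simply delegates to the constructor.
    UrlMessageSketch create(String url, int depth) {
        return new UrlMessageSketch(url, depth);
    }
}
```

A factory can still be worthwhile if messages are pooled or if the concrete message class has to be swappable through configuration; if neither applies, the constructor is the simpler choice.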

> Within this time we should also review the docs and adapt the LARM-speak
> (MessageHandler? MessageListener? MessageProcessor? Stage? Storage?)
> After this time I would like to concentrate on two things:
> The next thing I would like to do is to break up FetcherTask into at least
> two pieces (move parsing out) and change Messages such that they contain
> lists of URLs. This means StoragePipeline becomes a ProcessingPipeline.
> The other big issue I would like to take care of is the
> URLVisitedFilter/URLVisitedManager/URLSeenFilter. Its RAM usage must be
> optimized. I already have some ideas for that.
> That's only my part; I know that Otis, Kelvin and Peter wanted to work on
> other parts. May I suggest we all become familiar with David's work and read
> our docs once again?
> Btw, I also learned a lot from your code, David.
> Cheers,
> Clemens
