lucene-dev mailing list archives

From "Grant Ingersoll" <>
Subject Re: Dmitry's Term Vector stuff, plus some
Date Tue, 17 Feb 2004 19:59:20 GMT

I agree with your assessment about getting it right the first time.  I can make the changes,
as I don't think they are that involved and it will benefit me and my employer in the long
run if the changes are committed, since we won't have to reapply the patches every time there
is a new release.  

It would really speed things up if you can point me to examples of writing the version number
(and the logic for ignoring something of the wrong version) and the compressed format.


>>> 02/17/04 12:37PM >>>
Grant Ingersoll wrote:
> Was wondering if you consider your comments on the Term vector stuff to be a show stopper
> or not?  There hasn't been much response to your questions, so I wanted to bring it up again,
> as I do not want to see this go the way of the last attempt.

I proposed three changes:

   1. add a format version number to the new file formats, so that they 
can be altered back-compatibly;
   2. use a more compressed format for term vectors;
   3. don't read positional information unless it is asked for.
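
A minimal sketch of what change (2) might look like, assuming term positions
are stored as gaps between successive values and each gap is written as a
variable-length integer, similar in spirit to the VInt encoding Lucene's index
files already use.  The class and method names are hypothetical, and this is
not the format used in the patch under discussion:

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: delta-encode sorted positions as variable-length ints.
public class TermVectorCompressionSketch {

    // Write an int using 7 data bits per byte, low-order bits first.
    // The high bit of each byte says whether another byte follows.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Read one variable-length int; cursor[0] is the current offset into data.
    static int readVInt(byte[] data, int[] cursor) {
        byte b = data[cursor[0]++];
        int value = b & 0x7F;
        for (int shift = 7; (b & 0x80) != 0; shift += 7) {
            b = data[cursor[0]++];
            value |= (b & 0x7F) << shift;
        }
        return value;
    }

    // Store the count, then the gap between each position and the previous one.
    // Assumes positions are non-negative and sorted in ascending order.
    static byte[] encodePositions(int[] positions) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, positions.length);
        int previous = 0;
        for (int position : positions) {
            writeVInt(out, position - previous);
            previous = position;
        }
        return out.toByteArray();
    }

    // Reverse of encodePositions: rebuild absolute positions from the gaps.
    static int[] decodePositions(byte[] data) {
        int[] cursor = {0};
        int[] positions = new int[readVInt(data, cursor)];
        int previous = 0;
        for (int i = 0; i < positions.length; i++) {
            previous += readVInt(data, cursor);
            positions[i] = previous;
        }
        return positions;
    }

    public static void main(String[] args) {
        int[] positions = {3, 10, 200, 70000};
        byte[] encoded = encodePositions(positions);
        System.out.println(encoded.length + " bytes instead of " + 4 * positions.length);
        System.out.println(Arrays.toString(decodePositions(encoded)));
    }
}
```

For the four positions in main, the count and the gaps take 8 bytes, versus
16 bytes for four fixed-width ints.  Small gaps cost one byte each, which is
where most of the saving would come from.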

I'd like to see all of these fixed before a 1.4 release.  So which 
should be fixed before things are first committed?  That's a tricky question.

Once something's committed it's hard to remove, and it's also hard to be 
sure that any more work will be done on it.  Thus it's safest to only 
commit things that are just about release-ready.  Exceptions can be made 
when a developer can commit to continued work.  Are you committed to 
completing these changes in a timely manner, e.g., in the next few months?

I think change (1) is essential before anything is committed, to avoid 
breaking folks when the format does change.  However, if (2) is 
addressed before the 1.4 release, then there will be back-compatibility 
code in the 1.4 release, in order to be able to read the current format, 
even though it was never released.  That would be unfortunate.  So it 
would be best if both (1) and (2) were fixed before things are first 
committed.  As for (3), it could probably wait a bit.
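
As a rough illustration of change (1), the sketch below writes a version
number at the start of a file and refuses to read a version it does not
recognize.  Everything here is an assumption made for the example: the class
name, the constant FORMAT_VERSION, the 4-byte header, and the choice to throw
rather than silently skip.  It does not show how the patch or Lucene itself
does this:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a versioned header in front of an opaque payload.
public class FormatVersionSketch {

    // Assumed version number for the current format; bump it on any change.
    static final int FORMAT_VERSION = 1;

    // Prefix the payload with the format version as a 4-byte big-endian int.
    static byte[] write(byte[] payload) {
        ByteBuffer buffer = ByteBuffer.allocate(4 + payload.length);
        buffer.putInt(FORMAT_VERSION);
        buffer.put(payload);
        return buffer.array();
    }

    // Check the version before touching the payload, so that data written
    // in an unknown format is rejected instead of being misread.
    static byte[] read(byte[] file) {
        ByteBuffer buffer = ByteBuffer.wrap(file);
        int version = buffer.getInt();
        if (version != FORMAT_VERSION) {
            throw new IllegalStateException(
                "Unsupported format version " + version
                + " (expected " + FORMAT_VERSION + ")");
        }
        byte[] payload = new byte[buffer.remaining()];
        buffer.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] file = write(new byte[] {10, 20, 30});
        System.out.println(read(file).length + " payload bytes read");
    }
}
```

If a later release changed the layout, read() would be the place to branch
on the older version number.  That branch is the back-compatibility code
described above, which is only worth carrying for formats that were actually
released.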

Am I setting the bar too high here?  I really appreciate that you've 
done all this work, and I'm eager to get it committed.  This is a very 
sought-after feature.  But I don't want to commit something that's not 
quite ready.

Perhaps you feel you've done your share already, and want others to pick 
up the slack, fixing things like those named above.  If that's the case 
then perhaps we should go ahead and commit your changes as-is, and hope 
that others polish things a bit before a 1.4 release.  I'd prefer not to 
operate that way, but that might be our only option.


To unsubscribe, e-mail: 
For additional commands, e-mail: 
