lucene-java-user mailing list archives

From Andrzej Bialecki
Subject Re: Index pruning
Date Wed, 13 Jun 2012 14:19:46 GMT
On 30/05/2012 03:39, Greg Bowyer wrote:
> Hi all
> I am playing about with the index pruning contrib package, I want to see
> if it will make a faster and slightly smaller index for me. However when
> I try either Carmel or RIDF methods it just ends up deleting all my
> postings for the two fields of interest.
> My command line for RIDF is as follows, any ideas what I could be doing
> wrong ?
> java -cp
> ./lucene-pruning-3.6.1-sz_release.jar:./lucene-core-3.6.1_sz_release.jar
> org.apache.lucene.index.pruning.PruningTool \
> -del title:pPsv,descr:pPsv \
> -in ./index -out ./pruned-index2 \
> -impl ridf -t -0.1

Hey Greg,

Sorry for the late answer ... the field spec string "pPsv" configures the 
PruningPolicy to completely delete the postings, so it's doing what you 
told it to do ;) The actual pruning would have happened at a later 
stage, but since the postings are removed first there is nothing to prune.

If your intention was to apply pruning only to selected fields then 
there is no command-line option for this in the tool - however, it's 
easy to add it, because the *Policy implementations usually take a 
Map<String,Number> to specify thresholds, where keys are either field 
names or field:term pairs.
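
A minimal sketch of what such a threshold map could look like, using the 
two fields and the -0.1 threshold from the command line above. It does not 
call the pruning package at all: the class name ThresholdLookup, the helper 
thresholdFor, the "descr:lucene" entry, the default value, and the rule 
that a field:term entry overrides a plain field entry are all assumptions 
made for illustration, so check the actual *Policy constructors in your 
Lucene version before relying on any of it.

```java
import java.util.HashMap;
import java.util.Map;

public class ThresholdLookup {

    // Hypothetical helper, not part of the Lucene pruning package.
    // Assumed lookup order: "field:term" entry, then "field" entry,
    // then the caller-supplied default.
    static Number thresholdFor(Map<String, Number> thresholds,
                               String field, String term,
                               Number defaultThreshold) {
        Number perTerm = thresholds.get(field + ":" + term);
        if (perTerm != null) {
            return perTerm;
        }
        Number perField = thresholds.get(field);
        if (perField != null) {
            return perField;
        }
        return defaultThreshold;
    }

    public static void main(String[] args) {
        Map<String, Number> thresholds = new HashMap<String, Number>();
        // Keys that are plain field names apply to every term in that field.
        thresholds.put("title", -0.1);
        thresholds.put("descr", -0.1);
        // Keys of the form field:term apply to a single term (made-up entry).
        thresholds.put("descr:lucene", -0.5);

        // The default of 0.0 is an arbitrary illustrative value.
        System.out.println(thresholdFor(thresholds, "title", "index", 0.0));
        System.out.println(thresholdFor(thresholds, "descr", "lucene", 0.0));
        System.out.println(thresholdFor(thresholds, "body", "index", 0.0));
    }
}
```

A map built this way would be passed to the policy in place of whatever 
the tool constructs by default, so that only the listed fields get a 
non-default threshold.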

Best regards,
Andrzej Bialecki
  ___.,___,___,___,_._. __________________<><____________________
[___||.__|__/|__||\/|: Information Retrieval, System Integration
___|||__||..\|..||..|: Contact: info at sigram dot com
