lucene-solr-user mailing list archives

From Manuel Le Normand <manuel.lenorm...@gmail.com>
Subject Optimization storage issue
Date Sat, 02 Mar 2013 17:24:45 GMT
My use case is an index that turns over roughly monthly. Every day I index
a few thousand docs and delete a similar number of older documents, while a
few documents stay in the index forever (about 20% of my index). After a
few experiments, I found that leaving the deleted older documents in the
index (mostly in the *.tim file) significantly slows down my average QTime,
and I concluded that I need to optimize the index once every few days to
get rid of them.

Optimization temporarily requires about twice the index's storage. As I
have many shards with one replica each, and the optimization runs
simultaneously on all of them, I need twice the storage of my initial index
size, while half of it is used very infrequently (an optimization takes
about an hour).

1) Is it possible to use a shared storage pool for all shards, so that each
shard uses the spare storage in turn, forcing the optimizations to run one
after another rather than simultaneously? In that case the total storage
I'd need would be (total index storage + one shard's storage) instead of
twice the total index storage.
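
To make the arithmetic in (1) concrete, here is a rough sketch. The shard
sizes are made-up numbers, the function names are only illustrative, and it
assumes an optimize needs scratch space roughly equal to the size of the
shard being optimized (the "about twice" figure above):

```python
def peak_storage_simultaneous(shard_sizes, overhead_factor=1):
    """Peak storage when every shard optimizes at the same time.

    Each shard needs its own scratch space at once, so the scratch
    space adds up over all shards.
    """
    total_index = sum(shard_sizes)
    return total_index + overhead_factor * total_index


def peak_storage_sequential(shard_sizes, overhead_factor=1):
    """Peak storage when shards optimize one after another.

    Only one shard uses the shared spare storage at a time, so the
    pool only has to fit the largest shard.
    """
    total_index = sum(shard_sizes)
    return total_index + overhead_factor * max(shard_sizes)


# Hypothetical layout: four shards of 50 GB each (not my real numbers).
shard_sizes_gb = [50, 50, 50, 50]
print(peak_storage_simultaneous(shard_sizes_gb))  # 400
print(peak_storage_sequential(shard_sizes_gb))    # 250
```

With these assumed numbers the sequential scheme needs 250 GB instead of
400 GB, and the saving grows with the number of shards.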

2) When I run optimization on a replicated core, does the replica copy the
optimized index from its leader, or does it optimize independently?

Thanks,
Manu
