lucene-java-user mailing list archives

From Jyothsna Bavisetti <jyothsna.bavise...@oracle.com>
Subject Need suggestion on replacing forceMerge(1) with an alternative that consumes less space
Date Tue, 14 Apr 2020 05:55:49 GMT
Hi,

  

1. We upgraded Lucene from 4.6 to 8+. Since the upgrade we have been facing issues with Lucene index creation.

We index in a multi-threaded environment. When we build indexes in bulk, Lucene documents
get corrupted: data is not updated correctly, and data from different rows gets merged together.

2. When we call the updateDocument method for a single record, the change is not reflected in
the IndexReader until the count reaches 8. Once the count exceeds 8, the records become visible to the IndexReader
(8 segment files are created by then). Is there any way to reduce the number of segment files created?
 

3. Both issues above are resolved by forceMerge(1), but that is not feasible for our use case
because it takes 3X the memory, and we are creating indexes for very large data sets.

 

4. IndexWriter Config:
analyzer=com.datanomic.director.casemanagement.indexing.AnalyzerFactory$MA

ramBufferSizeMB=64.0

maxBufferedDocs=-1

mergedSegmentWarmer=null

delPolicy=com.datanomic.director.casemanagement.indexing.engines.TimedDeletionPolicy

commit=null

openMode=CREATE_OR_APPEND

similarity=org.apache.lucene.search.similarities.BM25Similarity

mergeScheduler=ConcurrentMergeScheduler: maxThreadCount=-1, maxMergeCount=-1, ioThrottle=true

codec=Lucene80

infoStream=org.apache.lucene.util.InfoStream$NoOutput

mergePolicy=[TieredMergePolicy: maxMergeAtOnce=10, maxMergeAtOnceExplicit=30, maxMergedSegmentMB=5120.0,
floorSegmentMB=2.0, forceMergeDeletesPctAllowed=10.0, segmentsPerTier=10.0, maxCFSSegmentSizeMB=8.796093022207999E12,
noCFSRatio=0.1, deletesPctAllowed=33.0

indexerThreadPool=org.apache.lucene.index.DocumentsWriterPerThreadPool@24348e05

readerPooling=true

perThreadHardLimitMB=1945

useCompoundFile=false

commitOnClose=true

indexSort=null

checkPendingFlushOnUpdate=true

softDeletesField=null

readerAttributes={}

writer=org.apache.lucene.index.IndexWriter@23a84a99
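
A note on reading the dump above: the maxCFSSegmentSizeMB value looks odd, but it appears to be Long.MAX_VALUE bytes expressed in MB, i.e. effectively no limit. The stdlib-only check below is a sketch under one assumption: that the byte limit is converted with an integer division by 1024 followed by a floating-point division by 1024. The class and method names are hypothetical.

```java
public class CfsLimitCheck {

    // Assumed conversion (not taken from the Lucene source): integer division
    // by 1024 first, then a floating-point division by 1024.
    static double toMB(long bytes) {
        return bytes / 1024 / 1024.0;
    }

    public static void main(String[] args) {
        double mb = toMB(Long.MAX_VALUE);
        // Compare against the value printed in the IndexWriter config dump.
        double fromDump = Double.parseDouble("8.796093022207999E12");
        System.out.println(mb == fromDump);
    }
}
```

If the comparison holds, the setting is simply the default "unlimited" value and not something to tune.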

 

Please suggest alternatives to forceMerge, and how to handle IndexWriter.commit in a multi-threaded setup
and commit data when updating a single record.
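
To make the single-record case concrete, here is a minimal sketch of the update path in point 2. It is not the code from the application: it assumes lucene-core 8.x on the classpath, and the class name, the field names ("id", "body") and the in-memory ByteBuffersDirectory are hypothetical. It uses a near-real-time reader obtained from the writer and refreshes it with DirectoryReader.openIfChanged. Whether a missing refresh is what causes the behaviour described above is an assumption.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class NrtUpdateSketch {

    // Builds a fresh Document per call, so no mutable Document or Field
    // instance is shared between indexing threads.
    static Document row(String id, String body) {
        Document doc = new Document();
        doc.add(new StringField("id", id, Field.Store.YES));
        doc.add(new TextField("body", body, Field.Store.YES));
        return doc;
    }

    // Returns the stored body seen by a reader refreshed after one updateDocument call.
    static String updateAndRead() throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            writer.addDocument(row("42", "old value"));
            // Near-real-time reader opened directly from the writer.
            DirectoryReader reader = DirectoryReader.open(writer);
            try {
                writer.updateDocument(new Term("id", "42"), row("42", "new value"));
                // An open reader is a point-in-time snapshot; request a newer one.
                DirectoryReader newer = DirectoryReader.openIfChanged(reader, writer);
                if (newer != null) {
                    reader.close();
                    reader = newer;
                }
                IndexSearcher searcher = new IndexSearcher(reader);
                int docId = searcher.search(new TermQuery(new Term("id", "42")), 1).scoreDocs[0].doc;
                return searcher.doc(docId).get("body");
            } finally {
                reader.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(updateAndRead());
    }
}
```

In this sketch no forceMerge(1) and no commit is involved; the refreshed reader is what picks up the update.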

 

 

Thanks,

Jyothsna

 
