lucene-java-user mailing list archives

From andi rexha <a_re...@hotmail.com>
Subject Getting new token stream from analyzer for legacy projects!
Date Fri, 12 Dec 2014 15:29:33 GMT
Hi, 
I have a legacy problem with the token stream. In my application I build a batch of documents using a single analyzer (due to configuration), and I add each field using the tokenStream obtained from that analyzer (for internal reasons). In pseudocode this translates to:

Analyzer analyzer = getFromConfig();
Collection<Document> docsToIndex = new ArrayList<>();

for (int i = 0; i < batchSize(); i++) {
    Document doc = new Document();
    // pass the analyzer's token stream directly to the field
    doc.add(new TextField("fieldName", analyzer.tokenStream("fieldName", currentReader)));
    docsToIndex.add(doc);
}

for (Document doc : docsToIndex) {
    indexWriter.addDocument(doc);
}



I always get an exception:
TokenStream contract violation: reset()/close() call missing, reset() called multiple times....

I understand that the analyzer reuses one TokenStream per thread, and that the TokenStream is only consumed by the DefaultIndexingChain during addDocument, so the same TokenStream instance ends up shared by all the documents in the batch.
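To make the failure mode concrete, here is a toy, Lucene-free sketch of the reuse contract (the class names and the check below are illustrative, not Lucene's actual API): the "analyzer" hands out one reused stream per thread, so requesting a second stream while the first one has not been consumed and closed trips the contract check.

```java
import java.util.ArrayList;
import java.util.List;

class ToyTokenStream {
    private boolean open = false;

    // stands in for the consumer-side reset(); throws if the previous
    // use of this reused instance was never closed
    void reset() {
        if (open) {
            throw new IllegalStateException(
                "TokenStream contract violation: reset()/close() call missing,"
                + " reset() called multiple times");
        }
        open = true;
    }

    void close() { open = false; }
}

class ToyAnalyzer {
    // one reused stream per thread, mimicking the analyzer's reuse strategy
    private final ThreadLocal<ToyTokenStream> reused =
            ThreadLocal.withInitial(ToyTokenStream::new);

    ToyTokenStream tokenStream() {
        ToyTokenStream ts = reused.get();
        ts.reset(); // second call on the same thread, stream still open -> throws
        return ts;
    }
}

public class ReuseDemo {
    public static void main(String[] args) {
        ToyAnalyzer analyzer = new ToyAnalyzer();
        List<ToyTokenStream> batch = new ArrayList<>();
        batch.add(analyzer.tokenStream());     // first document: fine
        try {
            batch.add(analyzer.tokenStream()); // second document: same instance
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is exactly the shape of my loop above: every iteration asks for a token stream, but none of them is consumed until the later addDocument loop.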


Is there a clean way to overcome this problem? One possibility, of course, would be to get each token stream from a separate thread, but that would be a dirty solution.

