lucene-dev mailing list archives

From Dmitry Kan <dmitry.luc...@gmail.com>
Subject Re: o.a.l.a.payloads.DelimitedPayloadTokenFilter reset()/close() call missing
Date Thu, 23 Apr 2015 14:46:06 GMT
Hi Uwe,

I think the mistake was in my own code. In my Lucene analyzer class I
implemented the following method:

@Override
protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer tokenizer = new WhitespaceTokenizer(reader);
    TokenStream result = new LowerCaseFilter(tokenizer);
    result = new DelimitedPayloadTokenFilter(result, '|', encoder);
    // Mistake: a second WhitespaceTokenizer is created here instead of
    // reusing the 'tokenizer' instance that the filters wrap.
    TokenStreamComponents tokenStreamComponents =
        new TokenStreamComponents(new WhitespaceTokenizer(reader), result);
    return tokenStreamComponents;
}



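It was a mistake to create WhitespaceTokenizer twice: the filter chain wraps
one instance, while TokenStreamComponents is given a different one as its
source. The correct implementation passes the same tokenizer to both: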

@Override
protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer tokenizer = new WhitespaceTokenizer(reader);
    TokenStream result = new LowerCaseFilter(tokenizer);
    result = new DelimitedPayloadTokenFilter(result, '|', encoder);
    // The source is the same tokenizer instance that the filters wrap.
    TokenStreamComponents tokenStreamComponents =
        new TokenStreamComponents(tokenizer, result);
    return tokenStreamComponents;
}
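
To illustrate why the duplicated tokenizer can end in a "reset()/close() call
missing" error, here is a minimal sketch in plain Java. It does not use Lucene
at all: ToyTokenizer, ToyLowerCaseChain and ToyComponents are hypothetical
stand-ins, and the behavior they model (new input is handed only to the source
tokenizer when the components are reused, while the consumer reads from the
end of the filter chain) is my assumption about the mechanism, not a copy of
Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizerWiringSketch {

    /** Toy stand-in for a tokenizer; NOT Lucene's Tokenizer class. */
    static class ToyTokenizer {
        private String pending;  // input handed over but not yet activated
        private String[] words;  // active input; null when there is none
        private int pos;

        ToyTokenizer(String input) { this.pending = input; }

        void setInput(String input) { this.pending = input; }

        /** Activates the pending input, if any. */
        void reset() {
            words = (pending == null) ? null : pending.trim().split("\\s+");
            pending = null;
            pos = 0;
        }

        String next() {
            if (words == null) {
                throw new IllegalStateException("contract violation: no active input");
            }
            return pos < words.length ? words[pos++] : null;
        }

        void close() { words = null; }
    }

    /** Toy stand-in for a lower-casing filter chain that wraps one tokenizer. */
    static class ToyLowerCaseChain {
        private final ToyTokenizer wrapped;
        ToyLowerCaseChain(ToyTokenizer wrapped) { this.wrapped = wrapped; }
        void reset() { wrapped.reset(); }
        String next() {
            String t = wrapped.next();
            return t == null ? null : t.toLowerCase();
        }
        void close() { wrapped.close(); }
    }

    /** Toy source/sink pair; new input always goes to the source (assumption). */
    static class ToyComponents {
        final ToyTokenizer source;
        final ToyLowerCaseChain sink;
        ToyComponents(ToyTokenizer source, ToyLowerCaseChain sink) {
            this.source = source;
            this.sink = sink;
        }
        void setInput(String input) { source.setInput(input); }
    }

    /** Consumer workflow: reset, read all tokens, close. */
    static List<String> consume(ToyComponents c) {
        List<String> out = new ArrayList<>();
        c.sink.reset();
        for (String t = c.sink.next(); t != null; t = c.sink.next()) {
            out.add(t);
        }
        c.sink.close();
        return out;
    }

    /** Mirrors the mistaken wiring: the source is a second tokenizer instance. */
    static ToyComponents buggy(String input) {
        ToyTokenizer tokenizer = new ToyTokenizer(input);
        return new ToyComponents(new ToyTokenizer(input), new ToyLowerCaseChain(tokenizer));
    }

    /** Mirrors the corrected wiring: source and chain share one tokenizer. */
    static ToyComponents correct(String input) {
        ToyTokenizer tokenizer = new ToyTokenizer(input);
        return new ToyComponents(tokenizer, new ToyLowerCaseChain(tokenizer));
    }

    /** Consumes once, supplies new input, consumes again; true if the second pass throws. */
    static boolean reuseFails(ToyComponents c) {
        consume(c);
        c.setInput("Second Pass");
        try {
            consume(c);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy wiring fails on reuse:   " + reuseFails(buggy("First Pass")));
        System.out.println("correct wiring fails on reuse: " + reuseFails(correct("First Pass")));
    }
}
```

In this toy model the mistaken wiring survives the first pass and only breaks
when the components are reused, because the new input lands in a tokenizer
that the filter chain never reads from.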



Sorry about the noise!



On 23 April 2015 at 17:14, Uwe Schindler <uwe@thetaphi.de> wrote:

> Of course!
>
> Do you have code to reproduce?
>
> Uwe
>
>
> On 23 April 2015 at 15:54:06 CEST, Dmitry Kan <dmitry.lucene@gmail.com>
> wrote:
>>
>> Hi,
>>
>> In Lucene 4.10.4 the DelimitedPayloadTokenFilter class seems to violate
>> the TokenStream contract. Should I raise a Jira issue? Thanks.
>>
>>
>>
>> java.lang.IllegalStateException: TokenStream contract violation:
>> reset()/close() call missing, reset() called multiple times, or subclass
>> does not call super.reset(). Please see Javadocs of TokenStream class for
>> more information about the correct consuming workflow.
>>     at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:111)
>>     at org.apache.lucene.analysis.util.CharacterUtils.readFully(CharacterUtils.java:241)
>>     at org.apache.lucene.analysis.util.CharacterUtils$Java5CharacterUtils.fill(CharacterUtils.java:283)
>>     at org.apache.lucene.analysis.util.CharacterUtils.fill(CharacterUtils.java:231)
>>     at org.apache.lucene.analysis.util.CharTokenizer.incrementToken(CharTokenizer.java:148)
>>     at org.apache.lucene.analysis.core.LowerCaseFilter.incrementToken(LowerCaseFilter.java:62)
>>     at org.apache.lucene.analysis.payloads.DelimitedPayloadTokenFilter.incrementToken(DelimitedPayloadTokenFilter.java:55)
>>
>>
> --
> Uwe Schindler
> H.-H.-Meier-Allee 63, 28213 Bremen
> http://www.thetaphi.de
>
