lucene-java-user mailing list archives

From Nicolás Lichtmaier <nicol...@wolfram.com.INVALID>
Subject Re: FlattenGraphFilter assertion error
Date Tue, 12 Mar 2019 18:57:10 GMT
I've created a Jira issue for this here: 
https://issues.apache.org/jira/browse/LUCENE-8723
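
For readers without a Lucene checkout, here is a minimal sketch of why the filter chain in the quoted message below might hand FlattenGraphFilter a broken graph. It does not use Lucene at all: `Tok`, `removeStopWords` and `deadEndNodes` are hypothetical stand-ins, the token layout for "x7in" is an assumption about what WordDelimiterGraphFilter emits with preserveOriginal=1, and whether this dead end is the exact condition behind the assertion is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of a token graph; these are NOT Lucene classes.
class TokenGraphSketch {

    // Hypothetical stand-in for a token's term, position increment and position length.
    static final class Tok {
        final String term;
        final int posInc;
        final int posLen;

        Tok(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    // Assumed output of WordDelimiterGraphFilter (preserveOriginal=1) for "x7in".
    static List<Tok> wordDelimiterOutput() {
        List<Tok> tokens = new ArrayList<>();
        tokens.add(new Tok("x7in", 1, 3)); // original token, node 0 -> 3
        tokens.add(new Tok("x", 0, 1));    // node 0 -> 1
        tokens.add(new Tok("7", 1, 1));    // node 1 -> 2
        tokens.add(new Tok("in", 1, 1));   // node 2 -> 3
        return tokens;
    }

    // Drops stop words while preserving positions: the increment of a removed
    // token is carried over to the next surviving token, if there is one.
    static List<Tok> removeStopWords(List<Tok> tokens, Set<String> stopWords) {
        List<Tok> kept = new ArrayList<>();
        int skipped = 0;
        for (Tok t : tokens) {
            if (stopWords.contains(t.term)) {
                skipped += t.posInc;
            } else {
                kept.add(new Tok(t.term, t.posInc + skipped, t.posLen));
                skipped = 0;
            }
        }
        return kept;
    }

    // Nodes where some token ends but no token starts, excluding the last node.
    static List<Integer> deadEndNodes(List<Tok> tokens) {
        Set<Integer> starts = new HashSet<>();
        Set<Integer> ends = new TreeSet<>();
        int pos = -1;
        int lastNode = 0;
        for (Tok t : tokens) {
            pos += t.posInc;
            starts.add(pos);
            ends.add(pos + t.posLen);
            lastNode = Math.max(lastNode, pos + t.posLen);
        }
        List<Integer> deadEnds = new ArrayList<>();
        for (int end : ends) {
            if (end != lastNode && !starts.contains(end)) {
                deadEnds.add(end);
            }
        }
        return deadEnds;
    }

    public static void main(String[] args) {
        List<Tok> full = wordDelimiterOutput();
        List<Tok> filtered = removeStopWords(full, Collections.singleton("in"));
        System.out.println(deadEndNodes(full));     // prints []
        System.out.println(deadEndNodes(filtered)); // prints [2]
    }
}
```

In this model, removing the stop word "in" leaves the "7" token ending at node 2 with nothing leaving that node, while the preserved "x7in" token still spans three positions.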

On 8 Mar 2019 at 00:08, Nicolás Lichtmaier wrote:
>
> Oops, sorry... in that code there's a "camelCase" parameter that is 
> not implemented in normal Lucene. That is an option I've added for 
> better camel case support, but the bug happens without that option as 
> well.
>
> On 7 Mar 2019 at 20:33, Nicolás Lichtmaier wrote:
>>
>> After a lot of time... Here's a small example that triggers that 
>> assertion.
>>
>>     Builder builder = CustomAnalyzer.builder();
>>     builder.withTokenizer(StandardTokenizerFactory.class);
>>     builder.addTokenFilter(WordDelimiterGraphFilterFactory.class,
>>         "camelCase", "1", "preserveOriginal", "1");
>>     builder.addTokenFilter(StopFilterFactory.class);
>>     builder.addTokenFilter(FlattenGraphFilterFactory.class);
>>     Analyzer analyzer = builder.build();
>>
>>     TokenStream ts = analyzer.tokenStream("*", new StringReader("x7in"));
>>     ts.reset();
>>     while (ts.incrementToken())
>>         ;
>>
>> This gives:
>>
>> Exception in thread "main" java.lang.AssertionError: 2
>>     at org.apache.lucene.analysis.core.FlattenGraphFilter.releaseBufferedToken(FlattenGraphFilter.java:195)
>>     at org.apache.lucene.analysis.core.FlattenGraphFilter.incrementToken(FlattenGraphFilter.java:258)
>>     at com.wolfram.textsearch.AnalyzerError.main(AnalyzerError.java:32)
>>
>> It seems to be the interaction between WordDelimiterGraphFilter and 
>> stop word removal that triggers the assertion when flattening.
>>
>>
>> On 12 Oct 2017 at 19:18, Michael McCandless wrote:
>>> Hmm, that's not good!  Clearly there is a bug somewhere.
>>>
>>> Are you able to isolate a small example, e.g. text input and 
>>> synonyms you fed to SynonymGraphFilter, to show this assertion trip?
>>>
>>> Are you using any custom analysis components before the 
>>> FlattenGraphFilter?
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Tue, Oct 10, 2017 at 11:24 AM, Nicolás Lichtmaier 
>>> <nicolasl@wolfram.com> wrote:
>>>
>>>     Hi!
>>>
>>>     I was getting an exception in FlattenGraphFilter and, as I saw
>>>     there were assertion statements nearby, I reran everything with
>>>     assertions enabled. And I see it crashes here
>>>     (FlattenGraphFilter.java:174)
>>>
>>>
>>>     At this point all of inputNode's fields are -1 (except nextOut,
>>>     which is 0), and outputFrom's value is 395.
>>>
>>>     The code is pretty complex, so before trying to understand it I
>>>     thought maybe someone would know what's happening just from seeing
>>>     this. Maybe not. =)
>>>
>>>     I'll keep the debugging session open for a while in case some
>>>     more variables could be useful to debug this.
>>>
>>>     Thanks!
>>>
>>>
>>>
