lucene-dev mailing list archives

From "Uwe Schindler"
Subject RE: BaseTokenStreamTestCase
Date Fri, 16 May 2014 23:21:55 GMT


You have to capture the state on the first token before inserting new ones. When inserting a
new token, call *only* restoreState(); clearAttributes() is not needed before restoreState().

If you don’t do this, your filter will work incorrectly if other filters come *after* it.


The assertion in BaseTokenStreamTestCase is therefore correct and mandatory. Many existing
filters show how to insert tokens correctly.
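
As a minimal sketch of that pattern (not taken from any particular Lucene filter): the
hypothetical RepeatUpperCaseFilter below passes each input token through unchanged and then
inserts an upper-cased copy at the same position. The class name and the upper-casing are
assumptions made up for illustration; captureState(), restoreState() and the attribute classes
are the regular Lucene ones, and the Lucene analysis classes are assumed to be on the classpath.

```java
import java.io.IOException;
import java.util.Locale;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.AttributeSource;

// Hypothetical example filter: emits every input token, then one inserted copy of it.
final class RepeatUpperCaseFilter extends TokenFilter {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PositionIncrementAttribute posIncAtt =
      addAttribute(PositionIncrementAttribute.class);

  // State of the last input token, kept until the inserted token has been emitted.
  private AttributeSource.State saved;

  RepeatUpperCaseFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (saved != null) {
      // Inserted token: restore ALL attributes of the original token first.
      // No clearAttributes() call is needed before restoreState().
      restoreState(saved);
      saved = null;
      String upper = termAtt.toString().toUpperCase(Locale.ROOT);
      termAtt.setEmpty().append(upper);
      posIncAtt.setPositionIncrement(0); // same position as the original token
      return true;
    }
    if (!input.incrementToken()) {
      return false;
    }
    // First token: capture its state before handing it to the consumer.
    saved = captureState();
    return true;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    saved = null; // drop any pending state when the stream is reused
  }
}
```

The attributes are shared along the whole chain, so a filter that runs after this one may
change them in place before the next incrementToken() call. Restoring the captured state means
the inserted token starts from the original token's values and not from whatever a later
filter left behind.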





Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen



From: Nitzan Shaked
Sent: Friday, May 16, 2014 6:28 AM
Subject: BaseTokenStreamTestCase


Hi all


While writing the unit tests for a new token filter I came across an issue(?) with
BaseTokenStreamTestCase.assertTokenStreamContents(): it goes to some length to ensure that
clearAttributes() was called for every token produced by the filter under test.


I suppose this helps most of the time, but my filter sometimes produces more than one output
token for a given input token. I don't want to care about which attributes the input token
carries, and so I don't clear attributes between producing the output tokens for a given input
token: I only change the attributes I care about (in my case this is charTerm right now, and
nothing else, not even positionIncrement).


This makes my unit tests unable to use BaseTokenStreamTestCase.assertTokenStreamContents().
I certainly do not want to add captureState() and "clearAttributes(); restoreState()"
calls just so I can pass the unit tests.


I would rather change assertTokenStreamContents() to support my use case, by adding a boolean
parameter and making the required changes everywhere else.




