lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: BaseTokenStreamTestCase
Date Fri, 16 May 2014 23:21:55 GMT
Hi,

 

You have to capture the state on the first token before inserting new ones. When inserting a new
token, call *only* restoreState(); clearAttributes() is not needed before restoreState().

If you don’t do this, your filter will work incorrectly if other filters come *after* it.

 

The assertion in BaseTokenStreamTestCase is therefore correct and mandatory. Many existing
filters show how to insert tokens correctly.
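
To illustrate why this matters, here is a minimal sketch in plain Java. It does not use Lucene's classes: Attrs, Source, Inserter, Marker and Demo are hypothetical stand-ins, with clear(), capture() and restore() playing the roles of clearAttributes(), captureState() and restoreState(). It assumes a downstream filter that sets a flag on some tokens and never resets it, relying on every token starting from a fresh state.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the attributes shared by every stage of a chain (NOT Lucene's AttributeSource).
class Attrs {
    String term = "";
    boolean keyword = false;

    void clear() { term = ""; keyword = false; }          // role of clearAttributes()
    Attrs capture() {                                      // role of captureState()
        Attrs copy = new Attrs();
        copy.term = term;
        copy.keyword = keyword;
        return copy;
    }
    void restore(Attrs s) { term = s.term; keyword = s.keyword; } // role of restoreState()
}

interface Stream { boolean next(); }

// Produces one token per word and clears the attributes first, as a tokenizer would.
class Source implements Stream {
    private final Attrs a;
    private final String[] words;
    private int i = 0;
    Source(Attrs a, String... words) { this.a = a; this.words = words; }
    public boolean next() {
        if (i >= words.length) return false;
        a.clear();
        a.term = words[i++];
        return true;
    }
}

// Emits each input token, then one inserted token with the suffix "_x".
class Inserter implements Stream {
    private final Attrs a;
    private final Stream in;
    private final boolean restore;
    private Attrs saved;
    private String pending;
    Inserter(Attrs a, Stream in, boolean restore) { this.a = a; this.in = in; this.restore = restore; }
    public boolean next() {
        if (pending != null) {
            if (restore) a.restore(saved); // go back to the state this filter originally saw
            a.term = pending;              // then change only what the inserted token needs
            pending = null;
            return true;
        }
        if (!in.next()) return false;
        saved = a.capture();               // capture on the first token, before inserting
        pending = a.term + "_x";
        return true;
    }
}

// Downstream filter: flags the term "foo" and never un-flags anything.
class Marker implements Stream {
    private final Attrs a;
    private final Stream in;
    Marker(Attrs a, Stream in) { this.a = a; this.in = in; }
    public boolean next() {
        if (!in.next()) return false;
        if (a.term.equals("foo")) a.keyword = true;
        return true;
    }
}

class Demo {
    static List<String> run(boolean restore) {
        Attrs a = new Attrs();
        Stream s = new Marker(a, new Inserter(a, new Source(a, "foo"), restore));
        List<String> out = new ArrayList<>();
        while (s.next()) out.add(a.term + ":" + a.keyword);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Demo.run(true));  // [foo:true, foo_x:false]
        System.out.println(Demo.run(false)); // [foo:true, foo_x:true]
    }
}
```

Without the restore step, the inserted token "foo_x" inherits the flag that the downstream filter set on "foo", even though the inserting filter never touched that attribute. In this toy model that is the kind of leak the test case's check on cleared attributes is meant to catch.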

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Nitzan Shaked [mailto:nitzan.shaked@gmail.com] 
Sent: Friday, May 16, 2014 6:28 AM
To: dev@lucene.apache.org
Subject: BaseTokenStreamTestCase

 

Hi all

 

While writing the unit tests for a new token filter I came across an issue(?) with BaseTokenStreamTestCase.assertTokenStreamContents():
it goes to some length to ensure that clearAttributes() was called for every token produced
by the filter under test.

 

I suppose this helps most of the time, but my filter sometimes produces more than one output
token for a given input token. I don't want to care about what attributes the input token
carries, and so I don't clear attributes between producing the output tokens from a given input
token: I only change the attributes I care about (in my case this is charTerm right now, and
nothing else, not even positionIncrement).

 

This makes my unit tests unable to use BaseTokenStreamTestCase.assertTokenStreamContents().
I certainly do not want to add a captureState() and "clearAttributes(); restoreState()"
calls just so I can pass the unit tests.

 

I would rather change assertTokenStreamContents to support my use case, by adding a boolean
and making the required changes everywhere else.

 

Thoughts?

Nitzan

 

