lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-5202) LookaheadTokenFilter consumes an extra token in nextToken
Date Sun, 08 Sep 2013 10:27:51 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761246#comment-13761246 ]

Michael McCandless commented on LUCENE-5202:
--------------------------------------------

Oh, sorry, I see; I had indeed thought you were trying to create new tokens (and changed the
test to do so).

OK, so for your first case (just changing attrs based on looked-ahead tokens), afterPosition
is not the right place to do that: this method is effectively called after the last token
leaving the current position has been emitted, and before setting attrs to the state for the
next token.  It's basically "between" tokens.

If you just want to change the attr values, I think you should do that in your incrementToken,
i.e. it would first call nextToken(), and if that returned true, it would then futz w/ the
attrs and return true.  Would that work?
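
To make the ordering concrete, here is a rough, self-contained sketch. It does not use
Lucene at all: ToyLookaheadFilter, its "term" field, and both subclasses are hypothetical
stand-ins invented for illustration, and the toy nextToken() only mimics the call order
described above (afterPosition runs between tokens, before the next token's state is set).
It is not how LookaheadTokenFilter is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

public class LookaheadSketch {

    /** Hypothetical stand-in for a lookahead-style filter; NOT the Lucene class. */
    static abstract class ToyLookaheadFilter {
        private final Iterator<String> input;
        private boolean emitted = false;
        protected String term; // stand-in for the current token's attribute state

        ToyLookaheadFilter(List<String> tokens) {
            this.input = tokens.iterator();
        }

        /** Hook that runs "between" tokens. */
        protected void afterPosition() {}

        protected boolean nextToken() {
            if (emitted) {
                // The previous token has already been returned to the consumer.
                afterPosition();
            }
            if (!input.hasNext()) {
                return false;
            }
            term = input.next(); // overwrites whatever afterPosition changed
            emitted = true;
            return true;
        }

        public abstract boolean incrementToken();
    }

    /** Tries to change the token in afterPosition: too late, the token is gone. */
    static class ChangeInAfterPosition extends ToyLookaheadFilter {
        ChangeInAfterPosition(List<String> tokens) { super(tokens); }

        @Override
        protected void afterPosition() {
            term = term.toUpperCase(Locale.ROOT);
        }

        @Override
        public boolean incrementToken() {
            return nextToken();
        }
    }

    /** Changes the token in incrementToken, right after nextToken() returns true. */
    static class ChangeInIncrementToken extends ToyLookaheadFilter {
        ChangeInIncrementToken(List<String> tokens) { super(tokens); }

        @Override
        public boolean incrementToken() {
            if (nextToken()) {
                term = term.toUpperCase(Locale.ROOT);
                return true;
            }
            return false;
        }
    }

    /** Consumes the filter and records the term seen after each incrementToken(). */
    static List<String> drain(ToyLookaheadFilter filter) {
        List<String> out = new ArrayList<>();
        while (filter.incrementToken()) {
            out.add(filter.term);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("quick", "brown", "fox");
        System.out.println(drain(new ChangeInAfterPosition(tokens)));  // [quick, brown, fox]
        System.out.println(drain(new ChangeInIncrementToken(tokens))); // [QUICK, BROWN, FOX]
    }
}
```

In the toy, the change made in afterPosition never reaches the consumer because the token
it touches has already been emitted and the next token's state replaces it. The change made
in incrementToken is visible because it happens after nextToken() has set the state and
before the token is returned.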
                
> LookaheadTokenFilter consumes an extra token in nextToken
> ---------------------------------------------------------
>
>                 Key: LUCENE-5202
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5202
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.3.1
>            Reporter: Benson Margulies
>         Attachments: LUCENE-5202.patch, LUCENE-5202.patch
>
>
> This is a bit hard to explain except by looking at the test case. I've coded a filter
> that uses LookaheadTokenFilter. The incrementToken method peeks some tokens. Then, it seems,
> nextToken in the Lookahead class calls peekToken itself, which seems to me to consume a token
> so that it's not seen when the derived class sets out to process the next set of tokens.
> In passing, this test case can be used to demonstrate that it does not work to try to
> use the afterPosition method to set up attributes of the token that we're 'after'. Probably
> that was never intended. However, I'm hoping for some feedback as to whether the rest of the
> structure here is as intended for subclasses of LookaheadTokenFilter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

