lucene-dev mailing list archives

From Mark Miller
Subject Re: TokenStream and Token APIs
Date Sun, 19 Oct 2008 15:25:24 GMT
Grant Ingersoll wrote:
> On Oct 19, 2008, at 12:56 AM, Mark Miller wrote:
>> Grant Ingersoll wrote:
>>> Bear with me, b/c I'm not sure I'm following, but looking at 
>>>, I see at least 5 
>>> different implemented Attributes.
>>> So, let's say I add 5 more attributes and now have a total of 10 
>>> attributes. Are you saying that I then would have, potentially, 10 
>>> different variables that all point to the token as in the code 
>>> snippet above where the casting takes place? Or would I just create 
>>> a single "Super" attribute that folds in all of my new attributes, 
>>> plus any other existing ones? Or, maybe, what I would do is create 
>>> the 5 new attributes and then 1 new attribute that extends all 10, 
>>> thus allowing me to use them individually, but saving me from having 
>>> to do a whole ton of casting in my Consumer.
>> Potentially one consumer doing 10 things, but not likely right? I 
>> mean, things will stay logical as they are now, and rather than a 
>> super consumer doing everything, we will still have a chain of 
>> consumers each doing its own piece. So more likely, maybe something 
>> comes along every so often (another 5, over *much* time, say) and 
>> each time we add a Consumer that uses one or two TokenStream types. 
>> And then it's just an implementation detail whether you make a 
>> composite TokenStream - if you have added 10 new attributes and see 
>> fit to make one consumer use them all, sure, make a composite 
>> super type, but in my mind, the way it's done in the example code is 
>> clearer/cleaner for a handful of TokenStream types. And even if you 
>> do make the composite super type, it's likely to just be a sugar 
>> wrapper anyway - the implementation for say, payload and positions, 
>> should probably be maintained in their own classes anyway.
> Well, there are 5 different attributes already, all of which are 
> commonly used.  Seems weird to have to cast the same var 5 different 
> ways.  Definitely agree that one would likely deal with this by 
> wrapping, but then you end up either needing to extend your wrapper or 
> add new wrappers...
Okay, I see: all of that is going to happen in one Consumer, since you're 
not going to want to read the TokenStream more than once. I see your point now.
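
To make the trade-off concrete, here is a rough sketch of the two styles 
being discussed: casting the same token once per attribute, versus casting 
once to a composite "sugar" type. All of the names here (TermAttr, 
PositionAttr, PayloadAttr, IndexingAttrs, SimpleToken) are made up for 
illustration and are not the interfaces from the patch or any Lucene API.

```java
// Hypothetical attribute interfaces; illustrative names only, not the Lucene API.
interface TermAttr { String term(); }
interface PositionAttr { int positionIncrement(); }
interface PayloadAttr { byte[] payload(); }

// Optional composite "sugar" type: adds no behavior, it only saves casts.
interface IndexingAttrs extends TermAttr, PositionAttr, PayloadAttr {}

// A single token object that implements every attribute.
final class SimpleToken implements IndexingAttrs {
    private final String term;
    private final int positionIncrement;
    private final byte[] payload;

    SimpleToken(String term, int positionIncrement, byte[] payload) {
        this.term = term;
        this.positionIncrement = positionIncrement;
        this.payload = payload;
    }

    @Override public String term() { return term; }
    @Override public int positionIncrement() { return positionIncrement; }
    @Override public byte[] payload() { return payload; }
}

final class AttributeCastingSketch {
    // Style 1: one variable (and one cast) per attribute, all pointing at the same token.
    static String describeWithCasts(Object token) {
        TermAttr termAttr = (TermAttr) token;
        PositionAttr posAttr = (PositionAttr) token;
        PayloadAttr payloadAttr = (PayloadAttr) token;
        return termAttr.term() + "/" + posAttr.positionIncrement()
                + "/" + payloadAttr.payload().length;
    }

    // Style 2: a single cast to the composite type.
    static String describeWithComposite(Object token) {
        IndexingAttrs attrs = (IndexingAttrs) token;
        return attrs.term() + "/" + attrs.positionIncrement()
                + "/" + attrs.payload().length;
    }

    public static void main(String[] args) {
        Object token = new SimpleToken("lucene", 1, new byte[] {42});
        System.out.println(describeWithCasts(token));
        System.out.println(describeWithComposite(token));
    }
}
```

Both methods read the token once and produce the same result. The catch 
with the composite is the one Grant raises: as soon as a new attribute shows 
up, IndexingAttrs either has to be extended or joined by yet another wrapper, 
while the per-attribute style just gains one more cast.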
