lucene-java-user mailing list archives

From Itai Peleg <pelegita...@gmail.com>
Subject Re: adding attributes to TokenStream
Date Tue, 01 Jan 2013 21:24:41 GMT
That worked great :) thanks a lot for the quick reply!

I have another question - after I "flagged" all my special tokens (in my
case, the ones that are entities) is there an elegant way of counting how
many of them I have in a document? I found an ugly way to do that, but I'm
sure there's a better one.

Thanks in advance,
Itai


2012/12/31 Michael Sokolov <sokolov@ifactory.com>

> On 12/31/2012 11:39 AM, Itai Peleg wrote:
>
>> Hi all,
>>
>> Can someone please post a simple example showing how to add additional
>> attributes to tokens in a TokenStream (inside incrementToken, for example)?
>>
>> I'm working on entity extraction and want to flag specific tokens as
>> entities, but I'm having problems.
>>
>> Thanks in advance,
>> Itai
>>
> Here's a simple example of a filter that adds an attribute saying
> whether a token is "the":
>
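> import java.io.IOException;
> import org.apache.lucene.analysis.TokenFilter;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
>
> // YourAttribute is a custom Attribute interface (with a matching
> // YourAttributeImpl class) that you define yourself.
> class YourTokenStream extends TokenFilter {
>   private final YourAttribute att;
>   private final CharTermAttribute term;
>
>   public YourTokenStream (TokenStream upstream) {
>     super (upstream);  // TokenFilter keeps the upstream stream as "input"
>     att = addAttribute (YourAttribute.class);
>     term = addAttribute (CharTermAttribute.class);
>   }
>
>   // Lucene expects incrementToken() (or the whole class) to be final.
>   @Override
>   public final boolean incrementToken () throws IOException {
>     if (input.incrementToken()) {
>       // buffer() can be longer than the token, so compare via toString(),
>       // and set the flag on every token so a stale value is never kept.
>       att.setIsAnEnglishArticle ("the".equals (term.toString()));
>       return true;
>     }
>     return false;
>   }
> }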
>
>
>
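
On the counting question at the top of this message, one possible approach is to consume the stream once and increment a counter whenever the flag is set. The sketch below is only an illustration of that loop and is not Lucene code: EntityCounter, FlaggedToken and countEntities are hypothetical stand-ins so that it runs without Lucene on the classpath. With a real TokenStream the same idea would presumably be reset(), then a while (incrementToken()) loop that checks the custom attribute, then end() and close().

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in; none of these types are Lucene classes.
final class EntityCounter {

  // Hypothetical token: its text plus the flag a custom attribute would carry.
  static final class FlaggedToken {
    final String text;
    final boolean isEntity;

    FlaggedToken(String text, boolean isEntity) {
      this.text = text;
      this.isEntity = isEntity;
    }
  }

  // Walk the tokens once and count those whose flag is set.
  static int countEntities(Iterable<FlaggedToken> tokens) {
    int count = 0;
    for (FlaggedToken token : tokens) {
      if (token.isEntity) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    List<FlaggedToken> doc = Arrays.asList(
        new FlaggedToken("Itai", true),
        new FlaggedToken("asked", false),
        new FlaggedToken("about", false),
        new FlaggedToken("Lucene", true));
    System.out.println(countEntities(doc)); // prints 2
  }
}
```

Keeping the count in the code that consumes the stream, rather than inside the filter, means the filter stays stateless and can be reused across documents.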
