lucene-solr-user mailing list archives

From Mikhail Khludnev <m...@apache.org>
Subject Re: Reusable tokenstream
Date Wed, 22 Nov 2017 11:36:50 GMT
Roxana,
Have you seen my response in the "tokenstream reusable" thread?
reusableTokenStream(java.lang.String
<https://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/analysis/Analyzer.html#reusableTokenStream(java.lang.String>
won't help you here. A TokenStream is effectively stateless: it does not keep
the tokens it has produced, it holds the attributes for the current token only.
In any case, it is reset before it is returned for later reuse, so it can't
carry any state over.
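
To make the alternative concrete, here is a minimal sketch of the "analyze once,
then derive the other fields" idea behind the update request processor that Emir
suggests in the quoted message below. It is plain Java, not Lucene or Solr API:
TypedToken, SharedAnalysis, analyze and termsOfType are hypothetical names, and
the "term/TYPE" input format is an assumption standing in for whatever your real
analysis chain emits as term and type attributes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a token carrying a type attribute; not a Lucene class.
final class TypedToken {
    final String term;
    final String type;

    TypedToken(String term, String type) {
        this.term = term;
        this.type = type;
    }
}

final class SharedAnalysis {
    // Stand-in for the expensive analysis chain, run once per document.
    // Assumption: input is whitespace-separated "term/TYPE" pairs.
    static List<TypedToken> analyze(String tagged) {
        List<TypedToken> tokens = new ArrayList<>();
        for (String part : tagged.trim().split("\\s+")) {
            int slash = part.lastIndexOf('/');
            if (slash <= 0) {
                continue; // skip pieces with no term or no separator
            }
            tokens.add(new TypedToken(part.substring(0, slash),
                                      part.substring(slash + 1)));
        }
        return tokens;
    }

    // Cheap second pass over the cached tokens: keep only one token type.
    static List<String> termsOfType(List<TypedToken> tokens, String type) {
        List<String> terms = new ArrayList<>();
        for (TypedToken token : tokens) {
            if (token.type.equals(type)) {
                terms.add(token.term);
            }
        }
        return terms;
    }
}
```

In a real update processor, the list returned by analyze would come from running
the shared analyzer once, and the results of termsOfType(tokens, "VERB") and
termsOfType(tokens, "ADJ") would become the values of the verbs and adjectives
fields. The type labels "VERB" and "ADJ" are placeholders for whatever types
your tagger assigns.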

On Wed, Nov 22, 2017 at 1:43 PM, Roxana Danger <roxana.danger@gmail.com>
wrote:

> Hi Emir,
> Many thanks for your reply.
> The UpdateProcessor can do this work, but is analyzer.reusableTokenStream
> <https://lucene.apache.org/core/3_0_3/api/core/org/apache/lucene/analysis/Analyzer.html#reusableTokenStream(java.lang.String, java.io.Reader)>
> the way to obtain a previously generated tokenstream? Is it guaranteed to
> give access to the existing token stream rather than reconstruct it?
> Thanks,
> Roxana
>
>
> On Wed, Nov 22, 2017 at 10:26 AM, Emir Arnautović <
> emir.arnautovic@sematext.com> wrote:
>
> > Hi Roxana,
> > I don’t think that it is possible. In some cases (yours seems like a good
> > fit) you could create a custom update request processor that would do the
> > shared analysis (you can have it defined in the schema) and, after analysis,
> > use those tokens to create new values for those two fields and remove the
> > source value (or flag it as ignored in the schema).
> >
> > HTH,
> > Emir
> > --
> > Monitoring - Log Management - Alerting - Anomaly Detection
> > Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> >
> >
> >
> > > On 22 Nov 2017, at 11:09, Roxana Danger <roxana.danger@gmail.com> wrote:
> > >
> > > Hello all,
> > >
> > > I would like to reuse the tokenstream generated for one field to create a
> > > new tokenstream (adding a few filters to the available tokenstream) for
> > > another field, without having to run the whole analysis again.
> > >
> > > The particular application is:
> > > - I have a field *tokens* whose analyzer generates the tokens (and
> > > maintains the token type attributes).
> > > - I would like to have two more fields: *verbs* and *adjectives*.
> > > These should reuse the tokenstream generated for the field *tokens* and
> > > keep only the verbs and adjectives for the respective fields.
> > >
> > > Is this feasible? How should it be implemented?
> > >
> > > Many thanks.
> >
> >
>



-- 
Sincerely yours
Mikhail Khludnev
