lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: SOLR/LUCENE 5.2.1: Solution of CharTermAtt, StartOffset, EndOffset, Position
Date Sat, 08 Aug 2015 04:11:39 GMT
I think you can just write a TokenFilter which sets the
PositionIncrementAttribute of every other token to 0. Then you can use
StandardTokenizer and wrap it with that filter.
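
To make the arithmetic concrete, here is a minimal sketch in plain Java of what
alternating the position increment does to the example sentence. It is only a
model: the class and method names (StackedPositions, increment, positions) are
made up for illustration, and no Lucene classes are used. In an actual
TokenFilter the same even/odd toggle would presumably sit in incrementToken()
and call PositionIncrementAttribute.setPositionIncrement(0) on every second
token.

```java
import java.util.Arrays;

// Plain-Java model of the position arithmetic only; no Lucene classes are used.
// All names here are hypothetical and chosen for illustration.
final class StackedPositions {

    // Position increment for the token at the given index:
    // even-indexed tokens advance the position (increment 1),
    // odd-indexed tokens stack on the previous token (increment 0).
    static int increment(int tokenIndex) {
        return (tokenIndex % 2 == 0) ? 1 : 0;
    }

    // Computes 1-based positions, the numbering used in the question,
    // by summing the increments token by token.
    static int[] positions(String[] tokens) {
        int[] result = new int[tokens.length];
        int position = 0;
        for (int i = 0; i < tokens.length; i++) {
            position += increment(i);
            result[i] = position;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] tokens = "fast wi fi network is down".split(" ");
        // Prints [1, 1, 2, 2, 3, 3]
        System.out.println(Arrays.toString(positions(tokens)));
    }
}
```

With this numbering, fast and wi share position 1, fi and network share
position 2, and is and down share position 3, which matches the two sequences
asked for in the question.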

Shai
On Aug 8, 2015 6:33 AM, "Văn Châu" <vankimchau@gmail.com> wrote:

> Hi,
>
> I'm looking for a solution to the following problem in Solr/Lucene 5.2.1.
> Example text: "fast wi fi network is down". With
> solr.StandardTokenizerFactory, the positions are displayed as:
> fast (1) -> wi (2) -> fi (3) -> network (4) -> is (5) -> down (6).
> I need a custom class that analyzes the same text, "fast wi fi network is
> down", but assigns positions as follows: fast (1) -> fi (2) -> is (3) or
> wi (1) -> network (2) -> down (3). I know this involves startOffset and
> endOffset, but I can't figure out how to solve it.
> Thanks in advance!
>
>
> [image: Inline image 1]
>
>
>
> ---------------------------
> VĂN KIM CHÂU
> [P]: +84.933.233.047
>
