lucene-java-user mailing list archives

From Benson Margulies <ben...@basistech.com>
Subject Re: PositionLengthAttribute
Date Sat, 07 Sep 2013 12:47:49 GMT
On Sat, Sep 7, 2013 at 8:39 AM, Robert Muir <rcmuir@gmail.com> wrote:
> On Sat, Sep 7, 2013 at 7:44 AM, Benson Margulies <benson@basistech.com> wrote:
>> In Japanese, compounds are just decompositions of the input string. In
>> other languages, compounds can manufacture entire tokens from thin
>> air. In those cases, it's something of a question how to decide on the
>> offsets. I think that you're right, eventually, insofar as there's
>> some offset in the original that might as well be blamed for any given
>> component.
>>
>
> Why change the offsets then? Offsets are for highlighting. Let the
> whole compound be highlighted when it's a match in search results. It's
> transparent and totally accurate as to what is happening: this is why
> we do highlighting, so that the user can make a relevance assessment
> about the document, not to help the end user debug the
> analysis chain or anything like that.

Thanks, that's very helpful. I spend all my time crawling around the
underside of this stuff and I lack perspective.
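
For the archive, a minimal sketch of the approach suggested above, as I understand it: every component produced from a compound simply reports the offsets of the whole compound, so a highlighter marks the entire surface form. This is plain Java, not Lucene code; SketchToken, CompoundOffsets and decompose are hypothetical names invented for illustration, and the decomposition itself is assumed to come from elsewhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token's term and offsets; not a Lucene class.
final class SketchToken {
    final String term;
    final int startOffset; // character offset into the original text
    final int endOffset;   // exclusive end offset into the original text

    SketchToken(String term, int startOffset, int endOffset) {
        this.term = term;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

final class CompoundOffsets {
    // Emit the compound followed by its components. Each component copies
    // the compound's offsets unchanged, so a match on any component
    // highlights the whole compound in the original text.
    static List<SketchToken> decompose(SketchToken compound, List<String> parts) {
        List<SketchToken> out = new ArrayList<>();
        out.add(compound);
        for (String part : parts) {
            out.add(new SketchToken(part, compound.startOffset, compound.endOffset));
        }
        return out;
    }
}
```

Because the components never get offsets of their own, it does not matter whether a component is a substring of the input or was manufactured by the analyzer: there is no per-component offset to decide on.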


>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


