lucene-java-user mailing list archives

From Andres de la Peña <adelap...@stratio.com>
Subject Re: LUCENE-6766 index sorting and custom SortField
Date Tue, 16 Aug 2016 20:21:28 GMT
I see. Each project that uses unsupported sort types would have to create
its own SegmentInfoFormat able to serialize the sort fields involved. It
would be provided by a custom FilterCodec registered through the SPI
mechanism. The checks for supported SortFields would also have to be
skipped in some way.

I think self-serializing SortFields would be considerably easier to use,
and would keep the purpose of LUCENE-6766: making index sorting easier to
use, so that it is no longer an expert feature.
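
To make the "self-serializing" idea concrete, here is a minimal sketch in
plain Java. It does not use any Lucene API: SerializableSortSpec,
SortSpecReader, KeySortSpec and the READERS registry are all hypothetical
names invented for illustration. Each sort type writes its own state after a
type name, and a reader looked up by that name restores it. In a real
implementation the lookup would presumably go through SPI rather than a
static map.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class SortSpecSerialization {

    /** Hypothetical stand-in for a sort field that knows how to write itself. */
    interface SerializableSortSpec {
        String typeName();
        void write(DataOutputStream out) throws IOException;
    }

    /** Hypothetical reader for one sort type, looked up by its type name. */
    interface SortSpecReader {
        SerializableSortSpec read(DataInputStream in) throws IOException;
    }

    /** Example custom sort: a field name plus a reverse flag (illustrative only). */
    static final class KeySortSpec implements SerializableSortSpec {
        static final String TYPE = "key_sort";
        final String field;
        final boolean reverse;

        KeySortSpec(String field, boolean reverse) {
            this.field = field;
            this.reverse = reverse;
        }

        @Override
        public String typeName() {
            return TYPE;
        }

        @Override
        public void write(DataOutputStream out) throws IOException {
            out.writeUTF(field);
            out.writeBoolean(reverse);
        }
    }

    /** Static registry standing in for an SPI lookup (assumption, not Lucene's mechanism). */
    static final Map<String, SortSpecReader> READERS = new HashMap<>();

    static {
        READERS.put(KeySortSpec.TYPE,
                in -> new KeySortSpec(in.readUTF(), in.readBoolean()));
    }

    /** Writes the type name first, then lets the spec write its own state. */
    static byte[] serialize(SerializableSortSpec spec) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(spec.typeName());
            spec.write(out);
        }
        return bytes.toByteArray();
    }

    /** Reads the type name, then delegates to the reader registered for it. */
    static SerializableSortSpec deserialize(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String type = in.readUTF();
            SortSpecReader reader = READERS.get(type);
            if (reader == null) {
                throw new IOException("unknown sort type: " + type);
            }
            return reader.read(in);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new KeySortSpec("primary_key", true));
        KeySortSpec restored = (KeySortSpec) deserialize(data);
        System.out.println(restored.field + " reverse=" + restored.reverse);
    }
}
```

With something along these lines the user would only register a reader for
the custom sort type, instead of shipping a whole codec and
SegmentInfoFormat. Backward compatibility of the written bytes would still
be the responsibility of whoever defines the type.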

2016-08-16 20:56 GMT+01:00 Michael McCandless <lucene@mikemccandless.com>:

> It would be a custom Codec implementation, where your codec could then
> make its own SegmentInfoFormat that knew how to serialize the custom sort
> field you had set on IndexWriterConfig.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Tue, Aug 16, 2016 at 2:51 PM, Andres de la Peña <adelapena@stratio.com>
> wrote:
>
>> Do you mean creating a custom SegmentInfoFormat for each custom
>> SortField?
>>
>> 2016-08-16 10:20 GMT+01:00 Michael McCandless <lucene@mikemccandless.com>:
>>
>>> I like that idea!
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Tue, Aug 16, 2016 at 4:45 AM, Adrien Grand <jpountz@gmail.com> wrote:
>>>
>>>> Maybe another way would be to create a custom SegmentInfoFormat that
>>>> handles the serialization of this custom SortField. That would put the
>>>> burden on the user to handle backward compatibility, but on the other
>>>> hand it would not require SortFields to handle their own serialization?
>>>> It would not work today since IndexWriterConfig does its own checks, but
>>>> maybe this is something we can fix to allow for custom sort orders?
>>>>
>>>> On Mon, Aug 15, 2016 at 20:57, Michael McCandless
>>>> <lucene@mikemccandless.com> wrote:
>>>>
>>>> > Hmm I see.  Yeah, it seems like the only way forward is to explore
>>>> > SortField (and its subclasses) handling their own serialization, maybe
>>>> > via SPI (what we use for codecs), though that sounds somewhat heavy.
>>>> > Maybe open an issue for discussion?
>>>> >
>>>> > Mike McCandless
>>>> >
>>>> > http://blog.mikemccandless.com
>>>> >
>>>> > On Mon, Aug 15, 2016 at 12:12 PM, Andres de la Peña
>>>> > <adelapena@stratio.com> wrote:
>>>> >
>>>> > > Hi,
>>>> > >
>>>> > > We are using a custom SortField
>>>> > > <https://github.com/Stratio/cassandra-lucene-index/blob/branch-3.0.8/plugin/src/main/java/com/stratio/cassandra/lucene/key/KeySort.java>
>>>> > > to sort Cassandra primary keys. The sort criteria are based on the
>>>> > > marshalled values of each of the columns in the primary key, so it is
>>>> > > not trivial at all to compute an equivalent collated value to be
>>>> > > indexed in doc values.
>>>> > >
>>>> > > Maybe it could be possible to define how to do this
>>>> > > serialization-deserialization when extending SortField. This way it
>>>> > > would be possible to recover this lost Lucene 5.x feature, don't you
>>>> > > think?
>>>> > >
>>>> > > Thanks,
>>>> > >
>>>> > > 2016-08-14 23:09 GMT+01:00 Michael McCandless
>>>> > > <lucene@mikemccandless.com>:
>>>> > >
>>>> > >> Unfortunately, as of LUCENE-6766, index sorting only supports simple
>>>> > >> sort types.  This was needed because Lucene needs to be able to easily
>>>> > >> serialize and de-serialize the sort order into the index.
>>>> > >>
>>>> > >> Can you compute your sort criteria and index it as a doc values field
>>>> > >> and then sort by that?
>>>> > >>
>>>> > >> Or, patches welcome too ;)
>>>> > >>
>>>> > >> Mike McCandless
>>>> > >>
>>>> > >> http://blog.mikemccandless.com
>>>> > >>
>>>> > >> On Sun, Aug 14, 2016 at 7:19 AM, Andres de la Peña
>>>> > >> <adelapena@stratio.com> wrote:
>>>> > >>
>>>> > >>> Hi,
>>>> > >>>
>>>> > >>> LUCENE-6766 allows index sorting to be defined on IndexWriterConfig
>>>> > >>> instead of through a SortingMergePolicy. However, the new index
>>>> > >>> sorting only supports some types of sort fields, and the old
>>>> > >>> SortingMergePolicy, which didn't have this limitation, has been
>>>> > >>> removed.
>>>> > >>>
>>>> > >>> What should projects that depend on index sorting with custom
>>>> > >>> SortFields do? Ignore the new index writer sorting and build their
>>>> > >>> own old-style sorting merge policy?
>>>> > >>>
>>>> > >>> Thanks,


-- 
Andrés de la Peña

Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // *@stratiobd
<https://twitter.com/StratioBD>*
