lucene-java-user mailing list archives

From Cristian Lorenzetto <cristian.lorenze...@gmail.com>
Subject Re: encoding in byteref?
Date Thu, 18 Aug 2016 09:16:16 GMT
In version 6.1.0, BigIntegerPoint seems to have been moved into the main module
(it is no longer in sandbox).
However:

1) BigIntegerPoint seems to be a class for searching a 128-bit integer, not for
sorting. NumericDocValuesField supports long, not BigInteger, so I used
SortedDocValuesField for sorting.
2) The name BigIntegerPoint may be misleading; maybe Integer128Point would be
better :)
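
A minimal sketch of how a BigInteger could be turned into bytes that sort correctly in a SortedDocValuesField, which orders values by unsigned byte comparison. The class and method names here (SortableBigInteger, encode, decode, compareUnsigned) are hypothetical and not part of Lucene; the fixed 16-byte width is an assumption matching the 128-bit limit mentioned above. The idea is to sign-extend the two's-complement form to a fixed width and flip the sign bit, so negative values sort before positive ones.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper, not a Lucene class: fixed-width sortable BigInteger bytes.
final class SortableBigInteger {
    static final int BYTES = 16; // assumed width: 128 bits

    // Encode as big-endian two's complement, sign-extended to BYTES, sign bit flipped.
    static byte[] encode(BigInteger value) {
        byte[] raw = value.toByteArray(); // minimal-length two's complement
        if (raw.length > BYTES) {
            throw new IllegalArgumentException("value does not fit in 128 bits: " + value);
        }
        byte[] out = new byte[BYTES];
        if (value.signum() < 0) {
            // Sign-extend negative values with 0xFF padding bytes.
            Arrays.fill(out, 0, BYTES - raw.length, (byte) 0xFF);
        }
        System.arraycopy(raw, 0, out, BYTES - raw.length, raw.length);
        out[0] ^= (byte) 0x80; // flip sign bit so unsigned order matches numeric order
        return out;
    }

    // Reverse of encode: flip the sign bit back and parse two's complement.
    static BigInteger decode(byte[] encoded) {
        byte[] raw = encoded.clone();
        raw[0] ^= (byte) 0x80;
        return new BigInteger(raw);
    }

    // Unsigned lexicographic comparison of two equal-length byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        BigInteger[] values = {
            BigInteger.valueOf(42), BigInteger.valueOf(-1), BigInteger.ZERO,
            BigInteger.ONE.shiftLeft(100), BigInteger.ONE.shiftLeft(100).negate()
        };
        // Sort by encoded bytes only; the result should also be in numeric order.
        Arrays.sort(values, (x, y) -> compareUnsigned(encode(x), encode(y)));
        System.out.println(Arrays.toString(values));
        // With Lucene on the classpath, the bytes could be indexed for sorting as:
        //   doc.add(new SortedDocValuesField("id", new BytesRef(encode(value))));
    }
}
```

The fixed width matters: variable-length byte arrays would compare by prefix and break numeric order.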

2016-08-11 14:31 GMT+02:00 Michael McCandless <lucene@mikemccandless.com>:

> To index into postings, use TextField (analyzes text into tokens) or
> StringField (indexes entire string as one token).  E.g. you could map
> boolean true to StringField("true").
>
> See BigIntegerPoint in lucene's sandbox module.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Wed, Aug 10, 2016 at 6:16 AM, Cristian Lorenzetto <
> cristian.lorenzetto@gmail.com> wrote:
>
> > Thanks for the suggestion about postings (I think you mean "postings
> > format"; I just found mentions on Google now :)). I still have difficulty
> > finding an example of how to use postings. Is there any example of how to
> > use postings in code? Just a link to an example?
> >
> > *Moving on to doc values*:
> > In version 6.1, doc values (like points) seem to use an additional field
> > for storing: StoredField. So if I understand correctly, how data of
> > different types is encoded for StoredField is not important.
> > If I want to sort (boolean, short, int, ... without NumericPrecisionStep),
> > how do I do it? From the code, it seems that doc values now do not store
> > the data but just save internal references (a linked list) to preserve
> > order, so maybe it is not important which number type I am using.
> > If I want to sort by a BigInteger, how do I do it?
> >
> > -
> >
> > 2016-08-10 11:49 GMT+02:00 Adrien Grand <jpountz@gmail.com>:
> >
> > > It would make little sense to use points for a boolean field in the 1D
> > > case since there are only two possible values; postings would likely be
> > > faster and use less disk space thanks to their skipping capabilities and
> > > better doc ID compression. Even with multiple dimensions, postings might
> > > still be a better option, e.g. with 3 dims of boolean fields, there are
> > > only 8 possible combinations.
> > >
> > >
> > >
> > > On Wed, Aug 10, 2016 at 11:41, Cristian Lorenzetto <
> > > cristian.lorenzetto@gmail.com> wrote:
> > >
> > > > In addition, in the previous version of my code I used
> > > > TYPE.setNumericPrecisionStep to set the precision of a number in
> > > > doc values. Now I see it is deprecated.
> > > > So I have a similar question in this case too: is it still possible
> > > > to use less space for (byte, boolean, short, int) types?
> > > >
> > > >
> > > > 2016-08-10 11:35 GMT+02:00 Cristian Lorenzetto <
> > > > cristian.lorenzetto@gmail.com>:
> > > >
> > > > > OK, thanks, so I can do them.
> > > > > But what about the boolean type? I could compress using bits. Is
> > > > > there a pack function for boolean arrays?
> > > > >
> > > > > 2016-08-10 11:25 GMT+02:00 Michael McCandless <
> > > > > lucene@mikemccandless.com>:
> > > > >
> > > > >> It's partially right!
> > > > >>
> > > > >> E.g. IndexWriter will use less memory, and so you'll get better
> > > > >> indexing throughput with a ShortPoint and BytePoint.
> > > > >>
> > > > >> But index size will be the same, because Lucene's default codec
> > > > >> does a good job compressing these values.
> > > > >>
> > > > >> Mike McCandless
> > > > >>
> > > > >> http://blog.mikemccandless.com
> > > > >>
> > > > >> On Wed, Aug 10, 2016 at 5:19 AM, Cristian Lorenzetto <
> > > > >> cristian.lorenzetto@gmail.com> wrote:
> > > > >>
> > > > >> > Sorry, but I was developing a ShortPoint and BytePoint to use
> > > > >> > less memory space. Is that wrong?
> > > > >> >
> > > > >> > 2016-08-09 22:01 GMT+02:00 Michael McCandless <
> > > > >> lucene@mikemccandless.com>:
> > > > >> >
> > > > >> > > It's best to index numerics using the new dimensional
> > > > >> > > points, e.g. IntPoint.
> > > > >> > >
> > > > >> > > Mike McCandless
> > > > >> > >
> > > > >> > > http://blog.mikemccandless.com
> > > > >> > >
> > > > >> > > On Tue, Aug 9, 2016 at 10:12 AM, Cristian Lorenzetto <
> > > > >> > > cristian.lorenzetto@gmail.com> wrote:
> > > > >> > >
> > > > >> > > > How to encode a short or a byte type in a BytesRef in
> > > > >> > > > Lucene 6.1?
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
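
For the question that opens the thread (encoding a short or a byte into a BytesRef), here is a rough sketch of one possible fixed-width, order-preserving encoding. SmallTypeEncoding and its methods are hypothetical names, not Lucene APIs. The sketch only produces plain byte arrays and relies on the sign-bit flip so that unsigned byte order equals numeric order. As noted in the thread, the default codec already compresses such values well, so this may not reduce index size.

```java
// Hypothetical helper, not a Lucene class: sortable bytes for short and byte values.
final class SmallTypeEncoding {

    // Two big-endian bytes with the sign bit flipped.
    static byte[] encodeShort(short v) {
        int flipped = v ^ 0x8000; // flip the sign bit of the low 16 bits
        return new byte[] {(byte) (flipped >>> 8), (byte) flipped};
    }

    static short decodeShort(byte[] b) {
        int flipped = ((b[0] & 0xFF) << 8) | (b[1] & 0xFF);
        return (short) (flipped ^ 0x8000);
    }

    // One byte with the sign bit flipped.
    static byte[] encodeByte(byte v) {
        return new byte[] {(byte) (v ^ 0x80)};
    }

    static byte decodeByte(byte[] b) {
        return (byte) (b[0] ^ 0x80);
    }

    public static void main(String[] args) {
        short s = -1234;
        byte[] encoded = encodeShort(s);
        System.out.println(decodeShort(encoded) == s);
        // With Lucene on the classpath, the array could be wrapped as:
        //   new BytesRef(encoded)
    }
}
```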
