lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: encoding in byteref?
Date Thu, 11 Aug 2016 12:31:07 GMT
To index into postings, use TextField (analyzes text into tokens) or
StringField (indexes entire string as one token).  E.g. you could map
boolean true to StringField("true").
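To make the idea concrete, here is a minimal conceptual sketch in plain Java of what "one token per document in postings" means for a boolean field. It does not use Lucene's API: the class BooleanPostings and its methods are hypothetical names invented for illustration, and real postings are far more compact (compressed doc IDs, skip data).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; this is not a Lucene class. */
final class BooleanPostings {
    // term ("true" or "false") -> doc IDs in the order they were added
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    /** Indexes the boolean as a single untokenized term for this document. */
    void add(int docId, boolean value) {
        postings.computeIfAbsent(Boolean.toString(value), t -> new ArrayList<>()).add(docId);
    }

    /** Returns the doc IDs whose field equals the given value. */
    List<Integer> docsFor(boolean value) {
        return postings.getOrDefault(Boolean.toString(value), Collections.<Integer>emptyList());
    }
}
```

With only two distinct terms, a lookup is a single map access followed by a scan of one doc ID list, which is the intuition behind preferring postings to points for booleans.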

For BigInteger, see BigIntegerPoint in Lucene's sandbox module.
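
As a rough sketch of the encoding idea behind point types for short and BigInteger values: write the value as fixed-width big-endian two's complement and flip the sign bit, so that unsigned byte-wise comparison matches numeric order. The code below is plain Java, not Lucene's implementation; SortableBytes and its method names are hypothetical, and the fixed width for BigInteger is an assumption the caller must choose.

```java
import java.math.BigInteger;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Lucene. */
final class SortableBytes {
    private SortableBytes() {}

    /** Encodes a short as 2 big-endian bytes with the sign bit flipped. */
    static byte[] encodeShort(short value) {
        int flipped = value ^ 0x8000; // negatives now sort before positives
        return new byte[] {(byte) (flipped >>> 8), (byte) flipped};
    }

    /** Encodes a BigInteger into exactly numBytes bytes, sign-extended, sign bit flipped. */
    static byte[] encodeBigInteger(BigInteger value, int numBytes) {
        byte[] twos = value.toByteArray(); // minimal big-endian two's complement
        if (twos.length > numBytes) {
            throw new IllegalArgumentException("value does not fit in " + numBytes + " bytes");
        }
        byte[] out = new byte[numBytes];
        byte pad = (byte) (value.signum() < 0 ? 0xFF : 0x00); // sign extension
        Arrays.fill(out, 0, numBytes - twos.length, pad);
        System.arraycopy(twos, 0, out, numBytes - twos.length, twos.length);
        out[0] ^= (byte) 0x80; // flip the sign bit
        return out;
    }

    /** Compares two byte arrays as unsigned bytes, left to right. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

For example, encodeShort((short) -1) compares below encodeShort((short) 0) under compareUnsigned, which is the property a byte-ordered index structure needs.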

Mike McCandless

http://blog.mikemccandless.com

On Wed, Aug 10, 2016 at 6:16 AM, Cristian Lorenzetto <
cristian.lorenzetto@gmail.com> wrote:

> Thanks for the suggestion about postings (I think you mean "postings format";
> I just found mentions on Google :)). I still have difficulty finding an
> example of how to use postings. Is there any example of using postings in
> code, or just a link to one?
>
> *Moving on to doc values*:
> In version 6.1, doc values (like points) seem to need an additional field for
> storing the value: StoredField. So if I understand correctly, how data of
> different types is encoded for StoredField is not important.
> If I want to sort (boolean, short, int, ... without NumericPrecisionStep), how
> do I do it? From the code, it seems doc values now do not store the data but
> only keep internal references (a linked list) to preserve order, so maybe it
> does not matter which number type I am using.
> If I want to sort on a BigInteger, how do I do it?
>
> -
>
> 2016-08-10 11:49 GMT+02:00 Adrien Grand <jpountz@gmail.com>:
>
> > It would make little sense to use points for a boolean field in the 1D case
> > since there are only two possible values; postings would likely be faster
> > and use less disk space thanks to their skipping capabilities and better
> > doc ID compression. Even with multiple dimensions, postings might still be
> > a better option, e.g. with 3 dims of boolean fields, there are only 8
> > possible combinations.
> >
> >
> >
> > On Wed, Aug 10, 2016 at 11:41, Cristian Lorenzetto <
> > cristian.lorenzetto@gmail.com> wrote:
> >
> > > In addition, in the previous version of my code I used
> > > TYPE.setNumericPrecisionStep to set the precision of a number in
> > > doc values. Now I see it is deprecated.
> > > So I have a similar question in this case too: is it still possible
> > > to use less space for (byte, boolean, short, int) types?
> > >
> > >
> > > 2016-08-10 11:35 GMT+02:00 Cristian Lorenzetto <
> > > cristian.lorenzetto@gmail.com>:
> > >
> > > > OK, thanks, so I can do them.
> > > > But what about the boolean type? I could compress using bits. Is there a
> > > > pack function for boolean arrays?
> > > >
> > > > 2016-08-10 11:25 GMT+02:00 Michael McCandless <lucene@mikemccandless.com>:
> > > >
> > > >> It's partially right!
> > > >>
> > > >> E.g. IndexWriter will use less memory, and so you'll get better indexing
> > > >> throughput with a ShortPoint and BytePoint.
> > > >>
> > > >> But index size will be the same, because Lucene's default codec does a
> > > >> good job compressing these values.
> > > >>
> > > >> Mike McCandless
> > > >>
> > > >> http://blog.mikemccandless.com
> > > >>
> > > >> On Wed, Aug 10, 2016 at 5:19 AM, Cristian Lorenzetto <
> > > >> cristian.lorenzetto@gmail.com> wrote:
> > > >>
> > > >> > Sorry, but I was developing a ShortPoint and a BytePoint in order to
> > > >> > use less memory space. Is that wrong?
> > > >> >
> > > >> > 2016-08-09 22:01 GMT+02:00 Michael McCandless <lucene@mikemccandless.com>:
> > > >> >
> > > >> > > It's best to index numerics using the new dimensional points, e.g.
> > > >> > > IntPoint.
> > > >> > >
> > > >> > > Mike McCandless
> > > >> > >
> > > >> > > http://blog.mikemccandless.com
> > > >> > >
> > > >> > > On Tue, Aug 9, 2016 at 10:12 AM, Cristian Lorenzetto <
> > > >> > > cristian.lorenzetto@gmail.com> wrote:
> > > >> > >
> > > >> > > > How do I encode a short or a byte type in a BytesRef in Lucene 6.1?
> > > >> > > >
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
