lucene-solr-user mailing list archives

From Yasufumi Mizoguchi <yasufumi0...@gmail.com>
Subject Re: What is the benefit of stored="true" in *PointFields
Date Thu, 07 Feb 2019 02:24:56 GMT
Hi, Shawn.

Thank you for your reply.

> Stored is smaller than docValues -- it's compressed, and docValues aren't.
Stored fields are indeed compressed, but I believe docValues are also
compressed, using different strategies depending on the field's values and
density, as the following Javadoc says.
https://lucene.apache.org/core/7_6_0/core/org/apache/lucene/codecs/lucene70/Lucene70DocValuesFormat.html
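
As a rough illustration of why the encoded size of numeric docValues depends on the values themselves, the sketch below estimates bits per value under three of the strategies named in that Javadoc (delta, GCD and table compression). It is a simplified model, not Lucene's actual encoder: the function names are hypothetical, and block-level encoding and sparse-field handling are ignored.

```python
import math
from functools import reduce


def bits_required(max_value: int) -> int:
    """Bits needed to bit-pack non-negative integers up to max_value."""
    return max_value.bit_length()


def estimate_bits_per_value(values):
    """Estimate bits per value for a few numeric docValues strategies.

    Simplified model for illustration only; not Lucene's real encoder.
    """
    lo, hi = min(values), max(values)
    estimates = {"raw": 64}  # an uncompressed Java long

    # Delta: store each value as an offset from the minimum.
    estimates["delta"] = bits_required(hi - lo)

    # GCD: if every offset shares a common divisor, store the quotients.
    gcd = reduce(math.gcd, (v - lo for v in values))
    if gcd > 1:
        estimates["gcd"] = bits_required((hi - lo) // gcd)

    # Table: with very few unique values (< 256), store an ordinal
    # into a lookup table instead of the value itself.
    unique = len(set(values))
    if unique < 256:
        estimates["table"] = bits_required(unique - 1)

    return estimates


# Millisecond timestamps that always fall on whole seconds.
timestamps = [1_549_500_000_000 + 1000 * i for i in range(1000)]
print(estimate_bits_per_value(timestamps))
```

Under this model the timestamps need about 10 bits each with GCD compression instead of 64, so the size of a docValues field is hard to predict without looking at the data.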

> Removing either docValues or stored on a numeric type is probably not
> going to make much difference in the total size of the index unless
> there are billions of documents.

Yes, I tried stored="false" on some numeric fields, but it did not help much.
So, I am now trying stored="false" on some string fields instead.
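
For reference, a change like this could be expressed as a Schema API `replace-field` command. The sketch below only builds and prints the JSON payload; it does not contact Solr. The field name `category_s`, the `string` type name and the `indexed` setting are assumptions for illustration and need to match the real schema.

```python
import json


def stored_off_docvalues_on(name: str, field_type: str) -> dict:
    """Build a Schema API 'replace-field' command for one field.

    The field name and type passed in are placeholders, not taken
    from any real schema.
    """
    return {
        "replace-field": {
            "name": name,
            "type": field_type,
            "indexed": True,               # assumption: the field is still searched
            "stored": False,               # drop the stored copy
            "docValues": True,             # keep the columnar copy
            "useDocValuesAsStored": True,  # return docValues in results
        }
    }


payload = stored_off_docvalues_on("category_s", "string")
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to the collection's `/schema` endpoint. `replace-field` replaces the whole field definition, and documents must be reindexed for the change to take effect. Note also that multi-valued fields returned from docValues come back sorted and de-duplicated rather than in their original order.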

Thank you for your advice,
Yasufumi.



On Thu, Feb 7, 2019 at 0:48, Shawn Heisey <apache@elyograg.org> wrote:

> On 2/6/2019 12:42 AM, Yasufumi Mizoguchi wrote:
> > I am using Solr 7.6 and want to reduce index size due to hardware
> > limitation.
> > I already tried to
> >   1. set false to unnecessary field's indexed/stored/docValues parameter
> >      in schema.
> >   2. set compressionMode="BEST_COMPRESSION" in solrconfig.
> >
> > These were quite good, but I still need to reduce index size.
> >
> > Then, I am now planning to set stored="false" in *PointFields only used
> > for range query, faceting and sorting. Because I think that
> > docValues="true" is enough to acquire field's value thanks to
> > useDocValuesAsStored parameter.
> >
> > But I also think this might lead to bad query performance...
>
> Stored values have pretty much zero bearing on query performance.
>
> Stored is smaller than docValues -- it's compressed, and docValues aren't.
>
> If you do not need docValues for some other aspect, like faceting or
> sorting, then choose stored.  If you need docValues for something, then
> choose docValues.
>
> Removing either docValues or stored on a numeric type is probably not
> going to make much difference in the total size of the index unless
> there are billions of documents.
>
> On a point type, queries like "field:333" will be slow.  This is the
> nature of a point type.  If you will frequently make queries for
> individual values, the Trie types (deprecated, will be removed in 8.0)
> are better.  Range queries (like "field:[444 TO 555]") perform best on a
> point type.
>
> Thanks,
> Shawn
>
