lucene-solr-user mailing list archives

From Tomás Fernández Löbbe <tomasflo...@gmail.com>
Subject Re: Slow faceting performance on a docValues field
Date Tue, 13 Jan 2015 18:14:15 GMT
Range faceting won't use DocValues even when they are set; it translates
each gap into a filter. This means it ends up using the filterCache, which
should make follow-up queries faster if you repeat the same gaps (and
don't commit).
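
To put a number on "each gap" for the request quoted below, here is a rough sketch that only does the date arithmetic on the start, end, and gap values from that request. It does not talk to Solr, and the variable names are mine.

```python
import math
from datetime import datetime, timedelta

# Values copied from the facet.range parameters in the quoted request.
start = datetime(2005, 3, 13, 16, 37, 18)
end = datetime(2015, 5, 13, 16, 37, 18)
gap = timedelta(days=1)  # facet.range.gap=+1DAY

# One filter is built per gap, so this is the number of filters
# the range facet has to evaluate for a single request.
gap_count = math.ceil((end - start) / gap)
print(gap_count)
```

That works out to a few thousand filters per request, which fits with the facet phase dominating the response time.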
You may also want to try interval faceting, which uses DocValues instead
of filters. The API is different: you have to provide the intervals
yourself.
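
A minimal sketch of what "providing the intervals yourself" could look like, assuming the `facet.interval` and `f.<field>.facet.interval.set` request parameters and the `[start,end)` interval syntax. The helper names (`interval_sets`, `interval_facet_params`) and the three-day sample range are made up for illustration; check the parameter details against the documentation for your Solr version.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

SOLR_DATE = "%Y-%m-%dT%H:%M:%SZ"

def interval_sets(start, end, gap):
    """Return interval strings "[lo,hi)" covering start..end in steps of gap."""
    sets = []
    lo = start
    while lo < end:
        hi = min(lo + gap, end)
        sets.append("[{},{})".format(lo.strftime(SOLR_DATE), hi.strftime(SOLR_DATE)))
        lo = hi
    return sets

def interval_facet_params(field, sets, query="*:*"):
    """Build request parameters as (name, value) pairs; one set parameter per interval."""
    params = [("q", query), ("rows", "0"), ("facet", "true"), ("facet.interval", field)]
    params += [("f.{}.facet.interval.set".format(field), s) for s in sets]
    return params

# Hypothetical three-day range, one interval per day.
sets = interval_sets(datetime(2015, 1, 1), datetime(2015, 1, 4), timedelta(days=1))
query_string = urlencode(interval_facet_params("eventDate", sets))
print(query_string)
```

Each interval becomes its own request parameter, so a multi-year range of daily intervals means a very long request; sending it as a POST or using coarser intervals may be more practical.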

Tomás

On Tue, Jan 13, 2015 at 10:01 AM, Shawn Heisey <apache@elyograg.org> wrote:

> On 1/13/2015 10:35 AM, David Smith wrote:
> > I have a query against a single 50M doc index (175GB) using Solr 4.10.2,
> > that exhibits the following response times (via the debugQuery option in
> > Solr Admin):
> > "process": {
> >  "time": 24709,
> >  "query": { "time": 54 }, "facet": { "time": 24574 },
> >
> >
> > The query time of 54ms is great and exactly as expected -- this example
> > was a single-term search that returned 3 hits.
> > I am trying to get the facet time (24.5 seconds) to be sub-second, and
> > am having no luck.  The facet part of the query is as follows:
> >
> > "params": { "facet.range": "eventDate",
> >  "f.eventDate.facet.range.end": "2015-05-13T16:37:18.000Z",
> >  "f.eventDate.facet.range.gap": "+1DAY",
> >  "start": "0",
> >  "rows": "10",
> >  "f.eventDate.facet.range.start": "2005-03-13T16:37:18.000Z",
> >  "f.eventDate.facet.mincount": "1",
> >  "facet": "true",
> >  "debugQuery": "true",
> >  "_": "1421169383802"
> >  }
> >
> > And, the relevant schema definition is as follows:
> >
> >    <field name="eventDate" type="tdate" indexed="true" stored="true" multiValued="false" docValues="true"/>
> >
> >     <!-- A Trie based date field for faster date range queries and date faceting. -->
> >     <fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0"/>
> >
> >
> > During the 25-second query, the Solr JVM pegs one CPU, with little or no
> > I/O activity detected on the drive that holds the 175GB index.  I have 48GB
> > of RAM, 1/2 of that dedicated to the OS and the other to the Solr JVM.
> >
> > I do NOT have any fieldValue caches configured as yet, because my
> > (perhaps too simplistic?) reading of the documentation was that DocValues
> > eliminates the need for a field-level cache on this facet field.
>
> 24GB of RAM to cache 175GB is probably not enough in the general case,
> but if you're seeing very little disk I/O activity for this query, then
> we'll leave that alone and you can worry about it later.
>
> What I would try immediately is setting the facet.method parameter to
> enum and seeing what that does to the facet time.  I've had good luck
> generally with that, even in situations where the docs indicated that
> the default (fc) was supposed to work better.  I have never explored the
> relationship between facet.method and docValues, though.
>
> I'm out of ideas after this.  I don't have enough experience with
> faceting to help much.
>
> Thanks,
> Shawn
>
>
