lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: A few random questions about solr queries.
Date Sun, 03 Jun 2012 13:11:38 GMT
See below:

On Tue, May 29, 2012 at 6:18 AM, santamaria2 <aravinda.rao@contify.com> wrote:
> *1)* With faceting, how does facet.query perform in comparison to
> facet.field? I'm just wondering this as in my use case, I need to facet over
> a field -- which would get me the top n facets for that field, but I also
> need to show the count for a "selected filter" which might have a relatively
> low count so it doesn't appear in the top n returned facets. So the solution
> would be to 'ensure' its presence by adding a 'facet.query=cat:val' in
> addition to my facet.field=cat.

You have two choices here. Either facet by field and specify that the
return should contain the "top", say, 1,000,000 values (which would be a
disaster in some cases), or facet by query. You really don't have any other
choice than to add the facet.query here, so the performance comparison is
moot.
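
A minimal sketch of what such a request could look like, built in Python.
The helper name build_facet_params is hypothetical, and q=*:* and rows=0 are
assumptions made only to keep the example short; facet, facet.field,
facet.limit and facet.query are the standard Solr faceting parameters.

```python
from urllib.parse import urlencode


def build_facet_params(field, selected_value, top_n=10):
    """Return a query string that facets on `field` (top `top_n` values)
    and also asks for an explicit count of `selected_value`."""
    # Escape backslashes and quotes so the value stays a valid phrase.
    escaped = selected_value.replace("\\", "\\\\").replace('"', '\\"')
    params = [
        ("q", "*:*"),        # assumption: match everything
        ("rows", "0"),       # assumption: only the facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", str(top_n)),
        # Guarantees a count for the selected value even when it is
        # not among the top_n values returned by facet.field.
        ("facet.query", f'{field}:"{escaped}"'),
    ]
    return urlencode(params)


query_string = build_facet_params("author", "John Smith")
```

The resulting string would be appended to the select handler URL of your
core. Repeat the facet.field/facet.query pair for each field you need.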

>
> I want to do this to quite a few fields.
>
> Related/example-based question:
> When I facet over a field, and something gets returned, eg: John Smith (83),
> and I also 'ensure' this facet's presence by having it in
> facet.query=author:"John Smith", are two different calculations performed?
> Or is the facet returned by facet.field also used by facet.query to obtain
> the count?
>

I'm pretty sure that two different calculations are performed, but I
don't know for certain. But again, it seems like your use case requires
the addition of the query, so why does it matter?

>
>
> *2) *Is there a performance issue if I have around, say, 20 facet.query
> conditions along with 10 facet.fields? 3/10 of those fields have around
> 100,000 possible values. Remaining have a few hundred each.
>

It Depends (tm). You don't say, for instance, how big your index is, or how
much memory you have, or... Really, the only good way to answer this question
is to try it and _then_ worry about it. So far, you've only described your
requirements, so asking about low-level implementation details seems premature
unless and until you see a performance problem.

>
>
> *3)* I've rummaged around a bit, looking for info on when to use q vs fq. I
> want to clear my doubts for a certain use case.
>
> Where should my date range queries go? In q or fq? The default settings in
> my site show results from the past 90 days with buttons to show stuff from
> the last month and week as well. But the user is allowed to use a slider to
> apply any date range... this is allowed, but it's not /that/ common.
> I definitely use fq for filtering various tags. Choosing a tag is a common
> activity.
>

In addition to Shawn's answer, using &fq clauses enables use of the
filterCache, which can substantially increase performance, but see this
blog post for some interesting considerations when using NOW:

http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/
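
As a rough illustration of the point about NOW: a bare NOW resolves to the
current millisecond, so every request produces a different filter and the
cached entry is never reused. Rounding with NOW/DAY keeps the filter
identical for all requests made on the same day. A minimal sketch, where
the helper name date_range_fq and the field name published_date are
hypothetical:

```python
def date_range_fq(field, days_back):
    """Build an fq clause covering the last `days_back` days.

    Both bounds are rounded to day boundaries with Solr date math
    (NOW/DAY), so the clause resolves to the same range for every
    request issued on the same day.
    """
    return f"{field}:[NOW/DAY-{days_back}DAYS TO NOW/DAY+1DAY]"


# Default view (past 90 days) plus the "last month" and "last week" buttons.
default_fq = date_range_fq("published_date", 90)
month_fq = date_range_fq("published_date", 30)
week_fq = date_range_fq("published_date", 7)
```

With this rounding, the first request of a new day computes a fresh filter
and later requests that day can reuse it. Arbitrary slider ranges are
probably rare enough that caching them matters less.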

Best
Erick

> Should the date range query go in fq? As I mentioned, the default view shows
> stuff from the past 90 days. So on each new day does this like invalidate
> stuff in the cache? Or is stuff stored in the filtered cache in some way
> that makes it easy to fetch stuff from the past 89 days when a query is
> performed the next day?
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/A-few-random-questions-about-solr-queries-tp3986562.html
> Sent from the Solr - User mailing list archive at Nabble.com.