lucene-solr-user mailing list archives

From Avlesh Singh <avl...@gmail.com>
Subject Re: Picking Facet Fields by Frequency-in-Results
Date Tue, 04 Aug 2009 05:01:58 GMT
I understand the general need here. Extending what you suggested (indexing
the field names themselves inside a multiValued field), you can perform
a query like this:
/search?q=myquery&facet=true&facet.field=indexedfields&facet.field=field1&facet.field=field2...&facet.sort=true

You'll get facets for all the fields (passed as multiple facet.field
params), including the one that gives you field frequency. You can do all
sorts of post-processing on this data to achieve the desired result.
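
As a rough sketch of that post-processing step, the snippet below ranks field names by how often they occur in the result set. It assumes the response was requested with wt=json and that the facet counts arrive as a flat list alternating value and count under facet_counts/facet_fields; the function name, the top_n and min_count parameters, and the sample counts (taken from the four-document example in the quoted message) are illustrative only.

```python
def rank_fields_by_frequency(facet_fields, meta_field="indexedfields",
                             top_n=2, min_count=2):
    """Return up to top_n field names, most frequent first.

    facet_fields is assumed to be the "facet_fields" mapping from a Solr
    JSON response, where each value is a flat list such as
    ["f1", 4, "f2", 3, ...] (value, count, value, count, ...).
    """
    flat = facet_fields.get(meta_field, [])
    # Pair each field name with the count that follows it.
    pairs = list(zip(flat[0::2], flat[1::2]))
    # Sort by descending count; Python's sort is stable, so ties keep
    # the order in which Solr returned them.
    pairs.sort(key=lambda pair: -pair[1])
    # Drop rarely-set fields, then keep only as many as the UI has room for.
    return [name for name, count in pairs if count >= min_count][:top_n]


# Hypothetical counts matching the Doc A-D example in the quoted message.
sample = {"indexedfields": ["f1", 4, "f2", 3, "f4", 2, "f3", 1]}
chosen = rank_fields_by_frequency(sample, top_n=3)
print(chosen)
```

The facets for the chosen fields are already in the same response (they were requested as extra facet.field params), so the UI can simply ignore the rest.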

Hope this helps.

Cheers
Avlesh

On Tue, Aug 4, 2009 at 2:20 AM, Chris Harris <ryguasu@gmail.com> wrote:

> One task when designing a facet-based UI is deciding which fields to
> facet on and display facets for. One possibility that I hope to
> explore is to determine which fields to facet on dynamically, based on
> the search results. In particular, I hypothesize that, for a somewhat
> heterogeneous index (heterogeneous in terms of which fields a given
> record might contain), the following rule might be helpful: Facet
> on a given field to the extent that it is frequently set in the
> documents matching the user's search.
>
> For example, let's say my results look like this:
>
> Doc A:
>  f1: foo
>  f2: bar
>  f3: <N/A>
>  f4: <N/A>
>
> Doc B:
>  f1: foo2
>  f2: <N/A>
>  f3: <N/A>
>  f4: <N/A>
>
> Doc C:
>  f1: foo3
>  f2: quiz
>  f3: <N/A>
>  f4: buzz
>
> Doc D:
>  f1: foo4
>  f2: question
>  f3: bam
>  f4: bing
>
> The field usage information for these documents could be summarized like
> this:
>
> field f1: Set in 4 docs
> field f2: Set in 3 docs
> field f3: Set in 1 doc
> field f4: Set in 2 docs
>
> If I were choosing facet fields based on the above rule, I would
> definitely want to display facets for field f1, since it occurs in all
> documents.  If I had room for another facet in the UI, I would facet
> f2. If I wanted another one, I'd go with f4, since it's more popular
> than f3. I probably would ignore f3 in any case, because it's set for
> only one document.
>
> Has anyone implemented such a scheme with Solr? Any success? (The
> closest thing I can find is
> http://wiki.apache.org/solr/ComplexFacetingBrainstorming, which tries
> to pick which facets to display based not on frequency but based more
> on a ruleset.)
>
> As far as implementation, the most straightforward approach (which
> wouldn't involve modifying Solr) would apparently be to add a new
> multi-valued "indexedfields" field to each document, which would note
> which fields actually have a value for each document. So when I pass
> data to Solr at indexing time, it will look something like this
> (except of course it will be in valid Solr XML, rather than this
> schematic):
>
> Doc A:
>  f1: foo
>  f2: bar
>  indexedfields: f1, f2
>
> Doc B:
>  f1: foo2
>  indexedfields: f1
>
> Doc C:
>  f1: foo3
>  f2: quiz
>  f4: buzz
>  indexedfields: f1, f2, f4
>
> Doc D:
>  f1: foo4
>  f2: question
>  f3: bam
>  f4: bing
>  indexedfields: f1, f2, f3, f4
>
> Then to choose which facets to display, I call
>
>
> http://myserver/solr/search?q=myquery&facet=true&facet.field=indexedfields&facet.sort=true
>
> and use the frequency information from this query to determine which
> fields to display in the faceting UI. (To get the actual facet
> information for those fields, I would query Solr a second time.)
>
> Are there any alternatives that would be easier or more efficient?
>
> Thanks,
> Chris
>
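
For reference, the indexing-time step described in the quoted message (recording which fields are actually set on each document) could be sketched as follows. This is only an illustration: the helper name add_indexed_fields is made up, documents are assumed to be plain dicts before being converted to Solr XML, and a field counts as "set" here when its value is not None, an empty string, or an empty list.

```python
def add_indexed_fields(doc, meta_field="indexedfields"):
    """Return a copy of doc with a multi-valued field listing the set fields."""
    present = sorted(
        name
        for name, value in doc.items()
        # Skip the bookkeeping field itself and any field with no value.
        if name != meta_field and value not in (None, "", [])
    )
    enriched = dict(doc)
    enriched[meta_field] = present
    return enriched


# Doc A from the example above: f3 and f4 are not set.
doc_a = {"f1": "foo", "f2": "bar", "f3": None, "f4": None}
print(add_indexed_fields(doc_a)["indexedfields"])
```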
