lucene-solr-user mailing list archives

From "Olivier H. Beauchesne" <oliv...@olihb.com>
Subject Re: filtering facets
Date Mon, 31 Aug 2009 13:45:42 GMT
Hi Mike,

No, my problem is that the field article_outlinks is multivalued, so it
also contains several URLs unrelated to my search. I would like to facet
only on the URLs that match my query.

For example (shown on a single document, although my search targets over
1M docs):

Doc1:
article_url:
url1.com/1
url2.com/2
url1.com/1
url1.com/3

My query is article_url:url1.com*, and I facet on article_url. I want
the result to be:
url1.com/1 (2)
url1.com/3 (1)

But right now, because url2.com/2 is stored in the same multivalued
field as the matching URLs, I get this:
url1.com/1 (2)
url1.com/3 (1)
url2.com/2 (1)

I can use facet.prefix to filter, but it's not very flexible if my URLs
contain a subdomain, since facet.prefix doesn't support wildcards.
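
One possible workaround is to filter the returned facet values on the
client side. The sketch below is only an illustration and not a Solr
feature: the helper name filter_facets is hypothetical, and it assumes
the facet counts have already been parsed from the response into
(value, count) pairs. It also assumes all candidate values were returned
(for example with a large facet.limit), which may be expensive over 1M
docs.

```python
from fnmatch import fnmatchcase


def filter_facets(facet_counts, pattern):
    """Keep only the (value, count) pairs whose value matches a
    shell-style wildcard pattern such as 'url1.com*'.

    Hypothetical helper: this runs on the client after Solr has
    answered; it does not change what Solr computes.
    """
    return [
        (value, count)
        for value, count in facet_counts
        if fnmatchcase(value, pattern)
    ]


# Facet counts from the example above, assumed to be already parsed
# from the Solr response into (value, count) pairs.
facet_counts = [("url1.com/1", 2), ("url1.com/3", 1), ("url2.com/2", 1)]

filtered = filter_facets(facet_counts, "url1.com*")
for value, count in filtered:
    print(f"{value} ({count})")
```

The same helper would accept a pattern like http*wikipedia.org*, which
covers subdomains that facet.prefix cannot express.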

Thank you,

Olivier

Mike Topper wrote:
> Hi Olivier,
>
> Are the facet counts on the URLs you don't want 0?
>
> If so, you can use facet.mincount to only return results greater than 0.
>
> -Mike
>
> Olivier H. Beauchesne wrote:
>   
>> Hi,
>>
>> Long time lurker, first time poster.
>>
>> I have a multi-valued field, let's call it article_outlinks containing
>> all outgoing urls from a document. I want to get all matching urls
>> sorted by counts.
>>
>> For example, I want to get all outgoing Wikipedia URLs in my documents
>> sorted by counts.
>>
>> So I execute a query like this:
>> q=article_outlinks:http*wikipedia.org*  and I facet on article_outlinks
>>
>> But I get facets containing the other urls in the documents. I can get
>> something close by using facet.prefix=http://en.wikipedia.org but I
>> want to include other subdomains on wikipedia (ex: fr.wikipedia.org).
>>
>> Is there a way to run a search and get only the facets matching my query?
>>
>> I know facet.prefix isn't a query, but is there a way to get that
>> behavior?
>>
>> Is it easy to extend solr to do something like that?
>>
>> Thank you,
>>
>> Olivier
>>
>> Sorry for my English.
>>
>>     
>
>
>   
