lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Faceting with null dates
Date Thu, 22 Dec 2011 13:06:46 GMT
> 1) the number of documents for a given date range R1 that do not have
> a value for the validToDate, i.e. the 99% of the documents

That makes no sense either: "for a given date range R1 that don't have a value".
You can't specify a range for a document that doesn't have a value!
I think you're asking for "the documents that *satisfy my query* that don't
have a date value". In that case, Chris' suggestion to use a pure-negative
facet.query will give you what you want. You can specify arbitrary "facet.query"
clauses alongside your facet.range parameters; they're just treated as separate
facets. Just tack it onto your request and it'll come back in a separate section
of the response.

Best
Erick
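
A minimal sketch of what such a single request might look like, assuming the parameters are assembled in Python before being sent to Solr. The helper name build_facet_params is hypothetical; the range values are copied from the request quoted later in this thread, and facet.range.other=all is included only because it was asked about. The facet.query count is expected to come back under facet_queries, separate from facet_ranges.

```python
from urllib.parse import urlencode


def build_facet_params(date_field="validToDate"):
    """Return Solr request parameters as (name, value) pairs.

    Hypothetical helper for illustration; the parameter names are
    standard Solr faceting parameters, the values come from the thread.
    """
    return [
        ("q", "*:*"),
        ("facet", "true"),
        # Range facet over documents that DO have a value in the field.
        ("facet.range", date_field),
        ("facet.range.start", "NOW/DAYS-4MONTHS"),
        ("facet.range.end", "NOW/DAY+1DAY"),
        ("facet.range.gap", "+1MONTH"),
        # Also report the "before", "between" and "after" counts.
        ("facet.range.other", "all"),
        # Separate facet: documents with NO value in the field.
        ("facet.query", f"-{date_field}:[* TO *]"),
    ]


params = build_facet_params()
query_string = urlencode(params)  # URL-encoded, ready to append to /select?
print(query_string)
```

Both facets are computed over the same result set, so one request covers both counts.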

On Thu, Dec 22, 2011 at 3:45 AM, kenneth hansen <kenhans@hotmail.co.uk> wrote:
>
> yes,
> I see that my question was a bit confusing. But thanks for your answers.
> I will try to clarify a bit.
>
> I query on a date field, validToDate. The value for this field is not present for 99%
> of the documents.
> What I would like to get is
> 1) the number of documents for a given date range R1 that do not have a value for the
> validToDate, i.e. the 99% of the documents
> 2) the number of documents for a given date range R2 that do have a value for the validToDate
>
> My question is really: is it possible to have just one query, or do I need to have two
> queries; one for 1) and one for 2). Will the "facet.range.other=all" help me in any way here?
>
> /k
>
>
>
>> Date: Thu, 15 Dec 2011 12:25:12 -0800
>> From: hossman_lucene@fucit.org
>> To: solr-user@lucene.apache.org
>> Subject: Re: Faceting with null dates
>>
>>
>> First of all, we need to clarify some terminology here: there is no such
>> thing as a "null date" in solr -- or for that matter, there is no such
>> thing as a "null value" in any field. documents either have some value(s)
>> for a field, or they do not have any values.
>>
>> If you want to constrain your query to only documents that have a value in
>> a field, you can use something like fq=field_name:[* TO *] ... if you want
>> to constrain your query to only documents that do *NOT* have a value in a
>> field, you can use fq=-field_name:[* TO *]
>>
>> Now, having said that, like Erick, i'm a little confused by your question
>> -- it's not clear if what you really want to do is:
>>
>> a) change the set of documents returned in the main result list
>> b) change the set of documents considered when generating facet counts
>> (w/o changing the main result list)
>> c) return an additional count of documents that are in the main result
>> list, but are not in the facet counts because they do not have the field
>> being faceted on.
>>
>> My best guess is that you are asking about "c" based on your last
>> sentence...
>>
>> : get is 3 results and 7 non-null validToDate facets. And as I write this,
>> : I start to wonder if this is possible at all as the facets are dependent
>> : on the result set and that this might be better to handle in the
>> : application layer by just extracting 10-7=3...
>>
>> ...subtracting the sum of all constraint counts from your range facet from
>> the total number of documents found won't necessarily tell you the number
>> of documents that have no value in the field you are faceting on --
>> because documents may have values outside the range of your start/end.
>>
>> Depending on what exactly it is you are looking for, you might find the
>> "facet.range.other=all" param useful, as it will return things like the
>> "between" counts (summing up all the docs between start->end) as well as
>> the "before" and "after" counts.
>>
>> But if you really just want to know "how many docs have no value for my
>> validToDate field?" you can get that very explicitly and easily using
>> facet.query=-validToDate:[* TO *]
>>
>> : <code><str name="facet">true</str><str
>> : name="f.validToDate.facet.range.start">NOW/DAYS-4MONTHS</str><str
>> : name="facet.mincount">1</str><str name="q">(*:*)</str><arr
>> : name="facet.range"><str>validToDate</str></arr><str
>> : name="facet.range.end">NOW/DAY+1DAY</str><str
>> : name="facet.range.gap">+1MONTH</str></code>
>> :
>> : <result name="response" numFound="10" start="0"><lst
>> : name="facet_counts"><lst name="facet_ranges"> <lst name="validToDate">
>> : <lst name="counts"> <int name="2011-11-14T00:00:00Z">7</int>
>>
>>
>> -Hoss
>
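
To make Hoss's point about subtraction concrete, here is a small worked example with made-up counts (they are not taken from the thread's data). It assumes a single-valued date field, so each document with a value falls into exactly one of "before", "between" or "after"; the function name missing_from_other_counts is hypothetical. The facet.query=-validToDate:[* TO *] approach gives the same number directly and does not rely on that assumption.

```python
def missing_from_other_counts(num_found, before, between, after):
    """Documents with no value, derived from facet.range.other=all counts.

    Only valid under the assumption that the field is single-valued.
    """
    return num_found - (before + between + after)


# Hypothetical numbers: 10 documents match the query, 7 fall inside the
# faceted range, 2 have a date earlier than the range start, and the
# remaining one has no validToDate at all.
num_found = 10
range_bucket_counts = [7]
before, between, after = 2, 7, 0

# Naive approach: subtract only the in-range buckets.
naive = num_found - sum(range_bucket_counts)

# Accounting for values outside start/end as well.
missing = missing_from_other_counts(num_found, before, between, after)

print(naive, missing)
```

The naive figure also counts the documents dated before the range start, which is why it overstates the number of documents without a value.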
