lucene-solr-user mailing list archives

From Tricia Williams <williams.tri...@gmail.com>
Subject Re: Why isn't the DateField implementation of ISO 8601 broader?
Date Wed, 07 Oct 2009 15:55:40 GMT
Chris Hostetter wrote:
> : I would expect field:2001-03 to be a hit on a partial match such as
> : field:[2001-02-28T00:00:00Z TO 2001-03-13T00:00:00Z].  I suppose that my
> : expectation would be that field:2001-03 would be counted once per day for each
> : day in its range. It would follow that a user looking for documents relating
>
> ...meanwhile someone else might expect no match unless the ambiguous date 
> is entirely contained within the range being queried on.
>   
If implemented in DateField I guess this behaviour would need to be 
configurable.
> (your implication of counting once per day would have pretty weird results 
> on faceting by the way)
>   
I agree.  It would be possible to have one document hit on a query but 
have hundreds of facet categories with a count of one under this 
scheme.  I'm leaning towards the scenario I described where the document 
would be counted once in an "other" facet category if it is relevant 
through rounding.
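
To illustrate the faceting concern, here is a minimal Python sketch (not Solr code) of the "counted once per day" scheme. The document id, the `days_in_range` helper and the per-day bucket layout are assumptions made for illustration only.

```python
from collections import Counter
from datetime import date, timedelta


def days_in_range(low, high):
    """Yield every calendar day from low to high, inclusive (hypothetical helper)."""
    day = low
    while day <= high:
        yield day
        day += timedelta(days=1)


# One document whose date is only known to month precision (2001-03),
# stored as the first and last day it could refer to.
docs = {"doc1": (date(2001, 3, 1), date(2001, 3, 31))}

# "Counted once per day": the document lands in every daily facet bucket.
facet_counts = Counter()
for doc_id, (low, high) in docs.items():
    for day in days_in_range(low, high):
        facet_counts[day] += 1

print(len(docs), "document,", len(facet_counts), "facet buckets")
```

A single month-precision document already fills 31 daily buckets, each with a count of one; a year-precision document would fill hundreds.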
> with unambiguous dates, you can have exactly what you want just by being a 
> little more verbose when indexing/querying, (and someone else can have 
> exactly what they want by being equally verbose using slightly different 
> options/queries)
>
> in your case: i would suggest that you use two fields: date_low and 
> date_high ... when you have an exact date (down to the smallest level of 
> granularity you care about) you put the same value in both fields, when 
> you have an ambiguous value (like 2001-03) you put the largest value 
> possible in date_high and the lowest value possible in date_low (ie: 
> date_low:2001-03-01T00:00:00Z & date_high:2001-03-31T23:59:59.999Z) then a 
> query for anything *overlapping* the range from feb28 to march 13 would 
> be...
>
> +date_low:[* TO 2001-03-13T00:00:00Z] +date_high:[2001-02-28T00:00:00Z TO *]
>
> ...it works for ambiguous dates, and it works for exact dates.
>
> (someone else who only wants to see matches if the ranges *completely* 
> overlap would just swap which end point they queried against which field)
>   
We've had a really similar solution in place for range queries for a 
while.  Our current problem is really faceting.
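
For reference, the two-field approach quoted above can be sketched in Python roughly as follows. The field names `date_low` and `date_high` and the overlap query come from the quoted message; the function names are hypothetical, and the containment query is only my reading of "swap which end point they queried against which field".

```python
from datetime import datetime


def overlaps(date_low, date_high, q_start, q_end):
    """True if the document's [date_low, date_high] span touches the query range."""
    return date_low <= q_end and date_high >= q_start


def contained(date_low, date_high, q_start, q_end):
    """True only if the document's span lies entirely inside the query range."""
    return date_low >= q_start and date_high <= q_end


def overlap_query(q_start, q_end):
    """Build the overlap query string in the form given in the quoted message."""
    return "+date_low:[* TO %s] +date_high:[%s TO *]" % (q_end, q_start)


def contained_query(q_start, q_end):
    """Assumed containment form: the end points are swapped between the two fields."""
    return "+date_low:[%s TO *] +date_high:[* TO %s]" % (q_start, q_end)


# Ambiguous date 2001-03: lowest and highest possible values.
march_low = datetime(2001, 3, 1)
march_high = datetime(2001, 3, 31, 23, 59, 59, 999000)

# Exact date: the same value goes in both fields.
exact = datetime(2001, 3, 5)

q_start = datetime(2001, 2, 28)
q_end = datetime(2001, 3, 13)

print(overlaps(march_low, march_high, q_start, q_end))   # ambiguous date, overlap test
print(contained(march_low, march_high, q_start, q_end))  # ambiguous date, containment test
print(overlaps(exact, exact, q_start, q_end))            # exact date, overlap test
print(overlap_query("2001-02-28T00:00:00Z", "2001-03-13T00:00:00Z"))
```

This covers range matching only; it does nothing for the faceting problem.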

Thanks,
Tricia
