lucene-solr-user mailing list archives

From Tricia Williams <williams.tri...@gmail.com>
Subject Re: Why isn't the DateField implementation of ISO 8601 broader?
Date Tue, 06 Oct 2009 22:38:19 GMT
Thanks for making me think about this a little bit deeper, Hoss.  
Comments in-line.

Chris Hostetter wrote:
> because those would be ambiguous.  if you just indexed field:2001-03 would 
> you expect it to match field:[2001-02-28T00:00:00Z TO 
> 2001-03-13T00:00:00Z] ... what about date faceting, what should the 
> counts be if you facet per day?
>   

I would expect field:2001-03 to be a hit on a partial match such as 
field:[2001-02-28T00:00:00Z TO 2001-03-13T00:00:00Z].  I suppose that my 
expectation would be that field:2001-03 would be counted once per day 
for each day in its range. It would follow that a user looking for 
documents relating to 1919 might also be interested in 1910.  But 
conversely a user looking for documents relating to 1919 might really 
only want documents specifically related to 1919.  Maybe the 
implementation would be smart (or configurable) about precision so that 
a document wouldn't be counted when the facets were requested at a 
precision with more significant figures than the indexed/stored value.  
Maybe there would be another facet category at each precision for 
"others" -- the documents that have less precision than the current date 
facet precision.  I'm envisioning a hierarchical system that starts 
general with century with click-throughs drilling down eventually to days.
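
To make the "partial match" expectation concrete, here is a minimal 
sketch in Python. It is not Solr code. It assumes a reduced-precision 
value is treated as the half-open interval it covers, and the helper 
names expand_partial_date and overlaps are hypothetical.

```python
from datetime import date, timedelta


def expand_partial_date(value):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' to a half-open (start, end) pair.

    Assumption: a reduced-precision date stands for the whole period it names.
    """
    parts = [int(p) for p in value.split("-")]
    if len(parts) == 1:
        (year,) = parts
        return date(year, 1, 1), date(year + 1, 1, 1)
    if len(parts) == 2:
        year, month = parts
        start = date(year, month, 1)
        # Roll over to January of the next year after December.
        end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
        return start, end
    if len(parts) == 3:
        start = date(*parts)
        return start, start + timedelta(days=1)
    raise ValueError("unsupported date precision: %r" % value)


def overlaps(value, range_start, range_end):
    """True if the period named by value intersects [range_start, range_end)."""
    start, end = expand_partial_date(value)
    return start < range_end and range_start < end


# The example from this thread: does 2001-03 hit the range query?
print(overlaps("2001-03", date(2001, 2, 28), date(2001, 3, 13)))
```

Under this interpretation 2001-03 overlaps the queried range, while a 
value such as 2001-04 does not.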

> ...your expectations may be different than everyone else's.  by requiring 
> that the dates be explicit there is no ambiguity, you are in control of 
> the behavior.
>   

I can see your point, but surely there are others out there with 
non-explicit date data?  Does my use case make sense to anyone else?

> you can always just index the first date of whatever block of time (month, 
> year, century, etc..) and then facet normally.
>
>   
Until a better solution presents itself we've gone the route of creating 
more fields for faceting on different blocks of time.  So fields for 
century, decade, year, month, and day will let us facet on each of these 
time periods as needed.  Documents with dates with less precision will 
not show up in date facets with more precision.  I was hoping there was 
an elegant hack for faceting on a prefix of a defined number of characters 
(prefix=*, prefix=**, prefix=***, ...) without having to explicitly 
specify ..., prefix=188, prefix=189, prefix=190, prefix=191, ...
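
As a rough illustration of the multi-field approach, the sketch below 
derives one facet value per block of time from a possibly partial date 
at index time. It is plain Python, not Solr code. The field names 
(date_century, date_decade, ...) are hypothetical, and it assumes 
four-digit years in YYYY, YYYY-MM or YYYY-MM-DD form.

```python
import re

# Accepts 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' (assumed input formats).
_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def facet_fields(value):
    """Map a partial date to facet field values, one per available precision.

    Fields finer than the precision of the input are simply omitted, so a
    year-only document never shows up in month or day facets.
    """
    match = _PARTIAL_DATE.match(value)
    if match is None:
        raise ValueError("unsupported date format: %r" % value)
    year, month, day = match.groups()
    fields = {
        "date_century": year[:2],  # e.g. '19' for 1919
        "date_decade": year[:3],   # e.g. '191' for 1919
        "date_year": year,
    }
    if month is not None:
        fields["date_month"] = "%s-%s" % (year, month)
    if day is not None:
        fields["date_day"] = "%s-%s-%s" % (year, month, day)
    return fields


print(facet_fields("1919"))
print(facet_fields("2001-03"))
```

Faceting on the decade field then yields the 188, 189, 190, 191, ... 
buckets without listing each prefix by hand.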

Regards,
Tricia
