lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Help with creating a solr schema
Date Sun, 03 Jan 2010 01:44:00 GMT
Another option is to model this problem in Solr with an even more
denormalized schema: you have one document per person per day. So,
instead of:
id=0 user=Alice start_date:1-Jan-2010  end_date:5-Jan-2010
you have:
id=0 user=Alice date:1-Jan-2010
id=1 user=Alice date:2-Jan-2010
id=2 user=Alice date:3-Jan-2010
id=3 user=Alice date:4-Jan-2010
id=4 user=Alice date:5-Jan-2010

For convenience in searching, I would do this with useful id values:
id=Alice_1-Jan-2010 user=Alice date:1-Jan-2010
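
As a rough illustration of the expansion described above, here is a minimal Python sketch that turns one availability range into one document per day. The function name expand_availability and the dict layout are hypothetical, and it assumes ISO-style dates (2010-01-01T00:00:00Z, the form Solr date fields expect) rather than the human-readable 1-Jan-2010 used in this thread.

```python
from datetime import date, timedelta


def expand_availability(user, start, end):
    """Yield one document per day in the inclusive range [start, end].

    Hypothetical helper: the field names (id, user, date) follow the
    example in this message; the id format is an assumption.
    """
    day = start
    while day <= end:
        yield {
            "id": f"{user}_{day.isoformat()}",       # e.g. Alice_2010-01-01
            "user": user,
            "date": f"{day.isoformat()}T00:00:00Z",  # Solr-style UTC date
        }
        day += timedelta(days=1)


docs = list(expand_availability("Alice", date(2010, 1, 1), date(2010, 1, 5)))
for doc in docs:
    print(doc)
```

Because the id is derived from the user and the day, re-indexing the same range would overwrite the existing documents instead of creating duplicates.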

Solr can handle hundreds of millions of documents quite well.

Also, using the date type for the dates allows you to use the date
range and facet options, which are more efficient than searching on
strings.
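
One possible way to combine the date range and facet options with the one-document-per-day layout, sketched in Python. It only builds request parameters and does not talk to a server. The field names date and user come from the example above; the helper names are hypothetical, and the idea of using facet.mincount to keep only users who have a document for every day in the window is my assumption, not something tested against a live index.

```python
from datetime import date


def solr_date(day):
    """Format a date the way Solr date fields expect (UTC, ISO 8601)."""
    return f"{day.isoformat()}T00:00:00Z"


def availability_params(start, end):
    """Build query parameters for 'who is available on every day in [start, end]'.

    Assumes one document per user per day, so a user with as many matching
    documents as there are days in the window is available throughout.
    """
    days = (end - start).days + 1
    params = {
        "q": f"date:[{solr_date(start)} TO {solr_date(end)}]",
        "rows": 0,               # only the facet counts are needed
        "facet": "true",
        "facet.field": "user",
        "facet.mincount": days,  # drop users missing any day in the window
    }
    return params, days


params, days = availability_params(date(2010, 1, 12), date(2010, 1, 26))
print(params["q"])
print(days)
```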

Lance

On Fri, Jan 1, 2010 at 9:38 PM, Israel Ekpo <israelekpo@gmail.com> wrote:
> On Fri, Jan 1, 2010 at 9:47 PM, JaredM <emrul.i@gmail.com> wrote:
>
>>
>> Thanks Ahmet and Israel.  I prefer Israel's approach since the amount of
>> metadata for the user is quite high but I'm not clear how to get around one
>> problem:
>>
>> If I had 2 availabilities (I've left it in human-readable form instead of
>> as
>> a UNIX timestamp only for ease of understanding):
>>
>> <field name="start_date">10-Jan-2010</field>
>> <field name="start_date">20-Jan-2010</field>
>> <field name="end_date">25-Jan-2010</field>
>> <field name="end_date">28-Jan-2010</field>
>>
>> and I wanted to query for availability between 12-Jan-2010 and 26-Jan-2010,
>> then wouldn't the above document be returned (even though the user
>> would
>> not be available 20-25 Jan)?
>> --
>> View this message in context:
>> http://old.nabble.com/Help-with-creating-a-solr-schema-tp26979319p26990178.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
> Unfortunately,
>
> For this particular use case, if you are using the out-of-the-box features
> available in Solr 1.4, without a custom Solr plugin using a custom Lucene
> filter and some special value storage arrangement for the fields, you will
> have to store each start and end date as a separate document. So, there will
> be N separate documents for each username if that user has N distinct
> periods of availability. The start date and end date fields would also have
> to be single valued instead of multi-valued as I specified in the earlier
> post.
>
> Sorry.
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
> http://www.israelekpo.com/
>

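To make the quoted problem concrete, here is a small pure-Python simulation (no Solr involved). The two availability periods are hypothetical values chosen for illustration, not the ones from the quoted message. It assumes that a range clause on a multi-valued field matches when any one of the field's values falls in the range, so the start and end values lose their pairing.

```python
from datetime import date

# Hypothetical availability periods for one user: 10-15 Jan and 24-28 Jan.
periods = [
    (date(2010, 1, 10), date(2010, 1, 15)),
    (date(2010, 1, 24), date(2010, 1, 28)),
]
query_start, query_end = date(2010, 1, 12), date(2010, 1, 26)


def multivalued_match(periods, q_start, q_end):
    """One document, multi-valued start_date/end_date: pairing is lost."""
    starts = [s for s, _ in periods]
    ends = [e for _, e in periods]
    return any(s <= q_start for s in starts) and any(e >= q_end for e in ends)


def per_period_match(periods, q_start, q_end):
    """One document per period, single-valued fields: pairing is kept."""
    return any(s <= q_start and e >= q_end for s, e in periods)


multi = multivalued_match(periods, query_start, query_end)
single = per_period_match(periods, query_start, query_end)
print(multi)   # True: a false positive, the user is away 16-23 Jan
print(single)  # False: no single period covers 12-26 Jan
```

The multi-valued check pairs the first period's start with the second period's end, which is the mismatch that separate documents with single-valued fields avoid.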


-- 
Lance Norskog
goksron@gmail.com
