lucene-solr-user mailing list archives

From Peter Karich <>
Subject Re: Search document design problem
Date Tue, 17 Aug 2010 11:46:03 GMT
Hi Wenca,

I am not sure whether my information here is really helpful for you,
sorry if not ;-)

> I want only hotels that have room with 2 beds and the room has a
package with all inclusive boarding and price lower than 400.

Could you tell us what exactly you want to search and filter on? Do you want
only the available beds/rooms of a hotel, or all of them?
The requirements seem a bit tricky, but a combination of dynamic
fields and the collapse feature could do it (with only one query).

In your case I would start indexing the hotels like:
name: hilton
country: USA
city: New York
beds_i (multivalued): 2 | 1 | 1 | ...
rooms_i: 123
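
As a rough sketch, the hotel document above could be built like this before
it is sent to Solr. The field names come from the listing; the "id" value is
a hypothetical placeholder, and a multivalued beds_i is assumed to be
declared accordingly in the schema.

```python
import json

# One hotel as a flat Solr-style document (sketch, not a fixed schema).
hotel_doc = {
    "id": "hotel-1",          # hypothetical unique key, not in the original listing
    "name": "hilton",
    "country": "USA",
    "city": "New York",
    "beds_i": [2, 1, 1],      # multivalued: one entry per room (assumed schema setting)
    "rooms_i": 123,
}

# Serialize for inspection; how it is posted depends on your Solr client.
payload = json.dumps([hotel_doc], indent=2)
print(payload)
```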

I am not sure how I would handle the booking/prices. Maybe you will have
to add an additional dynamic
field free_beds_periodX_i or price_periodX_i which reports the free beds
or prices for a specific period?
(where one period could be a week or even a day ...)
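
To make the period idea a bit more concrete, here is a minimal sketch that
derives such dynamic field names from a date. It assumes one period is an ISO
week; the helper names and the "2010w33" key format are made up for
illustration, and the range filter assumes integer prices.

```python
from datetime import date

def period_key(day: date) -> str:
    """Return a key such as '2010w33' for the ISO week containing `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{iso_year}w{iso_week:02d}"

def period_fields(day: date, free_beds: int, price: int) -> dict:
    """Build the per-period dynamic fields for one hotel document."""
    key = period_key(day)
    return {
        f"free_beds_period{key}_i": free_beds,
        f"price_period{key}_i": price,
    }

def price_below_fq(day: date, limit: int) -> str:
    """Filter query for 'price lower than limit' in the period of `day`."""
    return f"price_period{period_key(day)}_i:[* TO {limit - 1}]"

print(period_fields(date(2010, 8, 20), free_beds=3, price=350))
print(price_below_fq(date(2010, 8, 20), 400))
```

The number of distinct fields grows with the number of periods, so a day as
the period means many more fields than a week.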

For the other searches I would create another index, although it is
possible to put all the data in one index
and e.g. add a 'type' field to each document. With that field you can
then append a filter query to each query:
q=xy&fq=type:hotel (or fq=type:room)
If you really want to set up only one index, I would try this trick before
the collapse feature and see whether it works for you. (The collapse feature
is not as mature as the rest of Solr, but in some situations it works nicely.)
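
A small sketch of how such a type filter could be appended on the client
side. The function name is hypothetical; only the q and fq parameters and the
'type' field come from the text above.

```python
from urllib.parse import urlencode

def select_params(user_query: str, doc_type: str) -> str:
    """Build the query string for a search restricted to one document type."""
    return urlencode({
        "q": user_query,
        "fq": f"type:{doc_type}",   # filter query on the 'type' field
    })

print(select_params("xy", "hotel"))
print(select_params("xy", "room"))
```

The colon is percent-encoded in the resulting string, which Solr decodes back
to type:hotel.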

Hope this helps a bit to get started. (Regarding the 'Data Import
Handler' I cannot help, sorry.)


> Hi all,
> I would like to use Solr to replace our site search based on MySQL but
> I am not sure how to map entities into the search index. The model is
> described by the attached UML class diagram.
> I have a Hotel that resides in some City in some Country. The hotel
> has various Rooms. For each Room in a Hotel there are some Packages
> that can be purchased by the client.
> The entity returned from the search will be mainly the Hotel. E.g.:
> - all hotels in USA
> - all hotels in New York
> - all hotels with name containing "Hilton"
> - all hotels in Egypt with packages with all inclusive boarding
>   and price lower than 400 and startDate between 2010-08-20
>   and 2010-08-30
> Our application also uses faceting a lot, e.g.:
> - # of hotels per country/city
> - # of hotels based on room size
>     (# of beds - 1 bed - 100 hotels, 2 beds - 200 hotels, ...)
> - # of hotels based on all inclusive package prices
>     (0-100 EUR, 100-200 EUR, ...)
> But there are also use cases when a search should return a Room or
> Package directly.
> I'd like to use Data Import Handler to index directly from our
> database. But which approach of mapping entities into the search index
> to use? It seems to me that there are at least 2 ways.
> 1) One index based on Hotel with multivalued fields for Rooms and
> multivalued fields for Packages. In DIH:
> <document>
> <entity name="hotel" ...>
>    <field name="id" .../>
>    <entity name="room" ...>
>       <field name="room_id" .../>
>       <entity name="package"...>
>          <field .../>
>       </entity>
>    </entity>
> </entity>
> </document>
> But I am not sure whether this will work due to multivalued fields.
> The queries may span across all the entities - I want only hotels
> that have room with 2 beds and the room has a package with all
> inclusive boarding and price lower than 400.
> 2) Denormalize data, so that there will be only one index based on
> Packages containing (duplicated) all the data from Room and Hotel and
> then use Field Collapsing on Hotel ID for search results and faceting
> too.
> This would enable also direct search for Packages or Rooms but I am
> not sure about Field Collapsing which is still a kind of beta
> functionality and about potential performance costs.
> Can anybody give me some advice or share their experiences?
> Thanks a lot
> Wenca
