lucene-solr-user mailing list archives

From Derek Poh <d...@globalsources.com>
Subject Re: 1 main collection or multiple smaller collections?
Date Fri, 28 Apr 2017 02:18:55 GMT
Hi Shawn

1 set of data is the suppliers' info and 1 set is the suppliers' products info.
A user can do either a product search or a supplier search.

1 option I am thinking of is to put them in 1 single collection with each
product as a document. Each product document will have the supplier info
in it.
Product id will be the uniqueKey field.
With this option, the same supplier info will be in every product document
of the supplier.

A simplified example:
doc:
product id: P1
product description: XXX
supplier id: S1
supplier name: XXX
supplier address: XXX

doc:
product id: P2
product description: XXXYYY
supplier id: S1
supplier name: XXX
supplier address: XXX
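
To make the layout above concrete, here is a minimal Python sketch of how such denormalized documents could be assembled. It is plain data handling, not Solr API code: the field names (product_id, supplier_id and so on) and both helper functions are hypothetical, and the values are the placeholders from the example.

```python
# Hypothetical source rows, mirroring the simplified example above.
suppliers = {
    "S1": {"supplier_name": "XXX", "supplier_address": "XXX"},
}
products = [
    {"product_id": "P1", "product_description": "XXX", "supplier_id": "S1"},
    {"product_id": "P2", "product_description": "XXXYYY", "supplier_id": "S1"},
]


def build_product_docs(products, suppliers):
    """Return one document per product, with the supplier info copied in."""
    docs = []
    for product in products:
        doc = dict(product)  # product_id acts as the unique key
        doc.update(suppliers[product["supplier_id"]])
        docs.append(doc)
    return docs


def docs_to_reindex(docs, supplier_id):
    """Return the ids of every product document that embeds the given supplier."""
    return [d["product_id"] for d in docs if d["supplier_id"] == supplier_id]


docs = build_product_docs(products, suppliers)
affected = docs_to_reindex(docs, "S1")
```

With this layout, a change to supplier S1 alone means that both P1 and P2 have to be sent for indexing again, which is the extra indexing cost raised earlier in the thread.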

I may be influenced by DB concepts. Is such a design logical?


On 4/27/2017 8:50 PM, Shawn Heisey wrote:
> On 4/26/2017 11:57 PM, Derek Poh wrote:
>> There are some common fields between them.
>> At the source data end (database), the supplier info and product info
>> are updated separately. In this regard, should I separate them?
>> If it's in 1 single collection, when there are updates to only the
>> supplier info, the product info will be indexed again even though there
>> are no updates to it. Is my reasoning valid?
>>
>>
>> On 4/27/2017 1:33 PM, Walter Underwood wrote:
>>> Do they have the same fields or different fields? Are they updated
>>> separately or together?
>>>
>>> If they have the same fields and are updated together, I’d put them
>>> in the same collection. Otherwise, probably separate.
> Walter's statements are right on the money, you just might need a little
> more detail.
>
> There are two critical details that decide whether you even CAN
> combine different data in a single index: One is that all types of
> records must use the same field (the uniqueKey field) to determine
> uniqueness, and the value of this field must be unique across the entire
> dataset.  The other is that there SHOULD be a field with a name like
> "type" that your search client can use to differentiate the different
> kinds of documents.  This type field is not necessary, but it does make
> things easier.
>
> Assuming you CAN combine documents, there is still the question of
> whether you SHOULD.  If the fields that you will commonly search are the
> same between the different kinds of documents, and if people want to be
> able to do one search and get more than one of the document types you
> are indexing, then it is something you should consider.  If people will
> only ever search one type of document, you should probably keep them in
> separate indexes to keep things cleaner.
>
> Thanks,
> Shawn
>
>
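
The two points in the quoted reply (one uniqueKey value space shared by all records, plus a field to tell the kinds of documents apart) can be sketched in Python as follows. This is only an illustration under assumptions: the "id" and "type" field names, the prefixing scheme and both helper functions are hypothetical, and nothing here is mandated by Solr.

```python
def make_doc(doc_type, source_id, fields):
    """Build a document whose id is unique across all document types."""
    # Prefixing the source id with the type keeps supplier and product ids
    # from colliding when both kinds of record share one index.
    doc = {"id": f"{doc_type}-{source_id}", "type": doc_type}
    doc.update(fields)
    return doc


def assert_unique_ids(docs):
    """Raise ValueError if two documents share the same unique key value."""
    seen = set()
    for doc in docs:
        if doc["id"] in seen:
            raise ValueError(f"duplicate id: {doc['id']}")
        seen.add(doc["id"])


combined = [
    make_doc("supplier", "S1", {"name": "XXX", "address": "XXX"}),
    make_doc("product", "P1", {"description": "XXX", "supplier_id": "S1"}),
    make_doc("product", "P2", {"description": "XXXYYY", "supplier_id": "S1"}),
]
assert_unique_ids(combined)

# A search client can restrict results to one kind of document via the type field.
products_only = [d for d in combined if d["type"] == "product"]
```

In Solr itself, the same restriction would typically be expressed as a filter query on the type field, for example fq=type:product, assuming the field is named "type".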

