lucene-solr-user mailing list archives

From "Reitzel, Charles"
Subject RE: Structured and Unstructured data indexing in SolrCloud
Date Mon, 30 Mar 2015 19:49:59 GMT
Hi Vijay, 

The short answer is yes, you can combine almost anything you want into a single collection.
But, in addition to working out your queries, you might want to work out your data life cycle.

In our application, we have commingled the structured and unstructured documents into a single
collection for initial development purposes.  The only field they have in common is the unique
ID.  This works fine.
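
As a rough illustration of that layout, here is a minimal sketch of two document shapes that could sit in the same collection. All field names and values are hypothetical (the `_s`, `_d`, and `_txt` suffixes assume dynamic-field rules like those in Solr's example schemas); only `id` is shared.

```python
# Hypothetical documents; the field names are illustrative assumptions,
# not taken from the application described in this thread.
structured_doc = {
    "id": "order-1001",
    "customer_s": "ACME",
    "amount_d": 249.99,
}
unstructured_doc = {
    "id": "report-2015-03",
    "body_txt": "Full text extracted from a source file",
}


def common_fields(*docs):
    """Return the set of field names present in every document."""
    return set.intersection(*(set(doc) for doc in docs))
```

Calling `common_fields(structured_doc, unstructured_doc)` shows that the unique ID is the only overlap, which is all a single collection needs as long as each query targets the fields of the document kind it cares about.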

In production, however, we expect that factors such as query rates, access controls, load
balancing, availability, shard keys, overall document counts, and update frequency will drive
us to use separate collections.  For us, the deciding factor is less about "structured vs.
unstructured" and more about "public vs. private".  We have built our app so that splitting
the collection will have minimal impact: it already executes separate queries, in parallel,
at runtime.
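
One way to picture the parallel-query approach is the sketch below. It is not our actual code: the base URL, the collection names, and the merge-by-score step are all assumptions made for illustration. It sends the same query to each collection's standard `/select` handler on its own thread and combines the hits.

```python
from concurrent.futures import ThreadPoolExecutor
import json
import urllib.parse
import urllib.request

# Hypothetical base URL and collection names; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTIONS = ["public_docs", "private_docs"]


def solr_select(collection, query, rows=10):
    """Run one query against a collection's /select handler and return its docs."""
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "fl": "*,score", "wt": "json"}
    )
    url = f"{SOLR_BASE}/{collection}/select?{params}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["response"]["docs"]


def search_all(query, collections=COLLECTIONS, fetch=solr_select, rows=10):
    """Query every collection in parallel and merge the hits by score."""
    with ThreadPoolExecutor(max_workers=len(collections)) as pool:
        # Submit all queries first so they run concurrently.
        futures = {name: pool.submit(fetch, name, query, rows) for name in collections}
        merged = []
        for name, future in futures.items():
            for doc in future.result():
                # Tag each hit with the collection it came from.
                merged.append({**doc, "_collection": name})
    merged.sort(key=lambda doc: doc.get("score", 0.0), reverse=True)
    return merged[:rows]
```

The `fetch` parameter exists so the query function can be swapped out, for example with a stub in tests. Note that relevance scores from different collections are not strictly comparable, so sorting the merged list by score is only a rough ordering; an application may prefer to present each collection's results separately.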

Of course, your application is different.  YMMV, etc.


-----Original Message-----
From: Jack Krupansky
Sent: Sunday, March 29, 2015 4:26 PM
Subject: Re: Structured and Unstructured data indexing in SolrCloud

The first step is to work out the queries that you wish to perform - that will determine how
the data should be organized in the Solr schema.

-- Jack Krupansky

On Sun, Mar 29, 2015 at 4:04 PM, Vijay Bhoomireddy wrote:

> Hi,
> We have a requirement where both structured and unstructured data
> come into the system. We need to index both of them and then enable
> search functionality over them. We are using SolrCloud on the Hadoop
> platform. For structured data, we are planning to put the data into
> HBase and, for unstructured data, directly into HDFS.
> My question is how to index these sources under a single Solr core.
> Would it be possible to index both structured and unstructured data
> under a single core/collection in SolrCloud and then enable search
> functionality over that index?
> Thanks in advance.

