Thanks a lot, Otis,

While reading the SolrCloud documentation to understand how SolrCloud could run on HDFS, I got confused by the terms leader, replica, "non-replica" shard, core, index, and collection.

First the documentation says that one cannot add shards, then that one can add replica-only shards, and then the last paragraph, "Shard Splitting", says that something changed starting with Solr 4.3.

But it doesn't state that splitting a shard can result in a new non-replica shard on a newly added node, thus increasing the amount of storage available to the index / collection. Instead, it states that the "split action effectively makes two copies of the data as new shards", which sounds a lot like replica-style shards.

So, can a split produce a new non-replica shard on a newly added node?

Could there be some sort of tutorial describing how to add storage capacity to an index / collection, that is, how to add a node / shard / core that new documents can be sent to for indexing? (Of course, load balancing would be triggered, so I expect documents would be added to shards across a set of nodes.)
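
To make the question concrete: as far as I can tell, the split described in that paragraph is requested through the Collections API with action=SPLITSHARD. Here is a minimal sketch in Python that only builds the request URL (nothing is sent). The host, port, collection name, and shard name are placeholders I made up, not values from a real setup:

```python
from urllib.parse import urlencode


def split_shard_url(base_url, collection, shard):
    """Build a Collections API SPLITSHARD request URL (no request is sent)."""
    params = {
        "action": "SPLITSHARD",
        "collection": collection,  # hypothetical collection name
        "shard": shard,            # hypothetical shard name
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and names, for illustration only.
print(split_shard_url("http://localhost:8983/solr", "mycollection", "shard1"))
```

What I cannot tell from the documentation is whether the sub-shards created by such a call can end up on the newly added node.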


Charles VALLEE
Centre de compétence Big data

DATACENTER - Expertise en Energie Informatique (EEI)
32 avenue Pablo Picasso
92000 Nanterre
Tél. : + (0) 1 78 66 69 81

From:
To:
Date:        06/01/2015 18:55
Subject:     Re: Solr on HDFS in a Hadoop cluster

Oh, and

Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support *

On Tue, Jan 6, 2015 at 12:52 PM, Otis Gospodnetic <> wrote:

> Hi Charles,
> See and
> Otis
> On Tue, Jan 6, 2015 at 11:02 AM, Charles VALLEE <>
> wrote:
>> I am considering using *Solr* to extend *Hortonworks Data Platform*
>> capabilities to search.
>> - I found tutorials to index documents into a Solr instance from *HDFS*,
>> but I guess this solution would require a Solr cluster distinct from the
>> Hadoop cluster. Is it possible to have Solr integrated into the Hadoop
>> cluster instead? - *With the index stored in HDFS?*
>> - Where would the processing take place (could it be handed down to
>> Hadoop)? Is there a way to guarantee a level of service (CPU, RAM) - to
>> integrate with *Yarn*?
>> - What about *SolrCloud*: what does it bring to Hadoop-based
>> use cases? Does it stand for a Solr-only cluster?
>> - Well, if that could lead to something working with a role-based
>> authorization-compliant *Banana*, it would be Christmas again!
>> Thanks a lot for any help!
>> Charles
