helix-commits mailing list archives

From kisho...@apache.org
Subject git commit: [HELIX-23] Tutorial documentation
Date Fri, 01 Feb 2013 00:07:05 GMT
Updated Branches:
  refs/heads/master 4cb709163 -> c44a2badf


[HELIX-23] Tutorial documentation


Project: http://git-wip-us.apache.org/repos/asf/incubator-helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-helix/commit/c44a2bad
Tree: http://git-wip-us.apache.org/repos/asf/incubator-helix/tree/c44a2bad
Diff: http://git-wip-us.apache.org/repos/asf/incubator-helix/diff/c44a2bad

Branch: refs/heads/master
Commit: c44a2badfc4766f1a9122045ec2dccbbe9166195
Parents: 4cb7091
Author: Kishore Gopalakrishna <g.kishore@gmail.com>
Authored: Thu Jan 31 16:06:51 2013 -0800
Committer: Kishore Gopalakrishna <g.kishore@gmail.com>
Committed: Thu Jan 31 16:06:51 2013 -0800

----------------------------------------------------------------------
 src/site/markdown/ApiUsage.md     |  285 ---------------------------
 src/site/markdown/Architecture.md |    1 +
 src/site/markdown/Features.md     |   38 +++-
 src/site/markdown/Quickstart.md   |    2 +-
 src/site/markdown/Tutorial.md     |  334 ++++++++++++++++++++++++++++++++
 src/site/markdown/index.md        |  126 +++++++++----
 src/site/site.xml                 |   19 ++-
 src/site/xdoc/download.xml.vm     |    4 +-
 8 files changed, 470 insertions(+), 339 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/ApiUsage.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/ApiUsage.md b/src/site/markdown/ApiUsage.md
deleted file mode 100644
index 226d790..0000000
--- a/src/site/markdown/ApiUsage.md
+++ /dev/null
@@ -1,285 +0,0 @@
-<!---
-Licensed to the Apache Software Foundation (ASF) under one
-or more contributor license agreements.  See the NOTICE file
-distributed with this work for additional information
-regarding copyright ownership.  The ASF licenses this file
-to you under the Apache License, Version 2.0 (the
-"License"); you may not use this file except in compliance
-with the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing,
-software distributed under the License is distributed on an
-"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-KIND, either express or implied.  See the License for the
-specific language governing permissions and limitations
-under the License.
--->
-
-
-# Create an instance of Manager
-The first step of using the Helix api will be creating a Helix manager instance. 
-It requires the following parameters:
- 
-* clusterName: A logical name to represent the group of nodes
-* instanceName: A logical name of the process creating the manager instance. Generally this is host:port.
-* instanceType: Type of the process. This can be one of the following types:
-    * CONTROLLER: Process that controls the cluster, any number of controllers can be started but only one will be active at any given time.
-    * PARTICIPANT: Process that performs the actual task in the distributed system. 
-    * SPECTATOR: Process that observes the changes in the cluster.
-    * ADMIN: To carry out system admin actions.
-* zkConnectString: Connection string to Zookeeper. This is of the form host1:port1,host2:port2,host3:port3. 
-
-```
-      manager = HelixManagerFactory.getZKHelixManager(clusterName,
-                                                      instanceName,
-                                                      instanceType,
-                                                      zkConnectString);
-```
-                                                      
-#Setting up a cluster
-Initial setup of a cluster, involves creating appropriate znodes in the zookeeper. 
-
-```
-    //Create setuptool instance
-    ClusterSetupTool setupTool = new ClusterSetupTool(zkConnectString);
-    //Create cluster namespace in zookeeper
-    setupTool.addCluster(clusterName, true);
-    //Add six Participant instances, each instance must have a unique id. host:port is the standard convention
-    String instances[] = new String[6];
-    for (int i = 0; i < storageInstanceInfoArray.length; i++)
-    {
-      instance[i] = "localhost:" + (8900 + i);
-    }
-    setupTool.addInstancesToCluster(clusterName, instances);
-    //add the resource with 10 partitions to the cluster. Using MasterSlave state model. 
-    //See the section on how to configure a application specific state model
-    setupTool.addResourceToCluster(clusterName, "TestDB", 10, "MasterSlave");
-    //This will do the assignment of partitions to instances. Assignment algorithm is based on consistent hashing and RUSH. 
-    //See how to do custom partition assignment
-    setupTool.rebalanceResource(clusterName, "TestDB", 3);
-```
-
-## Participant
-Starting up a participant is pretty straightforward. After the Helix manager instance is created, only thing that needs to be registered is the state model factory. 
-The Methods on the State Model will be called when controller sends transitions to the Participant.
-
-```
-      manager = HelixManagerFactory.getZKHelixManager(clusterName,
-                                                          instanceName,
-                                                          InstanceType.PARTICIPANT,
-                                                          zkConnectString);
-     StateMachineEngine stateMach = manager.getStateMachineEngine();
-     //create a stateModelFactory that returns a statemodel object for each partition. 
-     stateModelFactory = new OnlineOfflineStateModelFactory();     
-     stateMach.registerStateModelFactory(stateModelType, stateModelFactory);
-     manager.connect();
-```
-
-```
-public class OnlineOfflineStateModelFactory extends
-		StateModelFactory<StateModel> {
-	@Override
-	public StateModel createNewStateModel(String stateUnitKey) {
-		OnlineOfflineStateModel stateModel = new OnlineOfflineStateModel();
-		return stateModel;
-	}
-	@StateModelInfo(states = "{'OFFLINE','ONLINE'}", initialState = "OFFINE")
-	public static class OnlineOfflineStateModel extends StateModel {
-        @Transition(from = "OFFLINE", to = "ONLINE")
-		public void onBecomeOnlineFromOffline(Message message,
-				NotificationContext context) {
-			System.out
-					.println("OnlineOfflineStateModel.onBecomeOnlineFromOffline()");
-			//Application logic to handle transition 
-		}
-        @Transition(from = "ONLINE", to = "OFFLINE")
-		public void onBecomeOfflineFromOnline(Message message,
-				NotificationContext context) {
-			System.out
-						.println("OnlineOfflineStateModel.onBecomeOfflineFromOnline()");
-			//Application logic to handle transition
-		}
-	}
-}
-```
-
-## Controller Code
-Controller needs to know about all changes in the cluster. Helix comes with default implementation to handle all changes in the cluster. 
-If you have a need to add additional functionality, see GenericHelixController on how to configure the pipeline.
-
-
-```
-      manager = HelixManagerFactory.getZKHelixManager(clusterName,
-                                                          instanceName,
-                                                          InstanceType.CONTROLLER,
-                                                          zkConnectString);
-     manager.connect();
-     GenericHelixController controller = new GenericHelixController();
-     manager.addConfigChangeListener(controller);
-     manager.addLiveInstanceChangeListener(controller);
-     manager.addIdealStateChangeListener(controller);
-     manager.addExternalViewChangeListener(controller);
-     manager.addControllerListener(controller);
-```
-This above snippet shows how the controller is started. You can also start the controller using command line interface.
-  
-```
-cd helix
-mvn clean install -Dmaven.test.skip=true
-cd helix-core/target/helix-core-pkg/bin
-chmod +x *
-./run-helix-controller.sh --zkSvr <ZookeeperServerAddress(Required)>  --cluster <Cluster name (Required)>
-```
-
-## Spectator Code
-A spectator simply observes all cluster is notified when the state of the system changes. Helix consolidates the state of entire cluster in one Znode called ExternalView.
-Helix provides a default implementation RoutingTableProvider that caches the cluster state and updates it when there is a change in the cluster
-
-```
-manager = HelixManagerFactory.getZKHelixManager(clusterName,
-                                                          instanceName,
-                                                          InstanceType.PARTICIPANT,
-                                                          zkConnectString);
-manager.connect();
-RoutingTableProvider routingTableProvider = new RoutingTableProvider();
-manager.addExternalViewChangeListener(routingTableProvider);
-
-```
-
-In order to figure out who is serving a partition, here are the apis
-
-```
-instances = routingTableProvider.getInstances("DBNAME", "PARITION_NAME", "PARTITION_STATE");
-```
-
-##  Helix Admin operations
-Helix provides multiple ways to administer the cluster. It has a command line interface and also a REST interface.
-
-```
-cd helix
-mvn clean install -Dmaven.test.skip=true
-cd helix-core/target/helix-core-pkg/bin
-chmod +x *
-./helix-admin.sh --help
-Provide zookeeper address. Required for all commands  
-   --zkSvr <ZookeeperServerAddress(Required)>       
-
-Add a new cluster                                                          
-   --addCluster <clusterName>                              
-
-Add a new Instance to a cluster                                    
-   --addNode <clusterName InstanceAddress(host:port)>                                      
-
-Add a State model to a cluster                                     
-   --addStateModelDef <clusterName <filename>>    
-
-Add a resource to a cluster            
-   --addResource <clusterName resourceName partitionNum stateModelRef <mode(AUTO_REBALANCE|AUTO|CUSTOM)>>      
-
-Upload an IdealState(Partition to Node Mapping)                                         
-   --addIdealState <clusterName resourceName <filename>>            
-
-Delete a cluster
-   --dropCluster <clusterName>                                                                         
-
-Delete a resource
-   --dropResource <clusterName resourceName>                                                           Drop an existing resource from a cluster
-
-Drop an existing Instance from a cluster    
-   --dropNode <clusterName InstanceAddress(host:port)>                    
-
-Enable/disable the entire cluster, this will basically pause the controller which means no transitions will be trigger, but the existing node sin the cluster continue to function 
-   --enableCluster <clusterName>
-
-Enable/disable a Instance. Useful to take a faulty node out of the cluster.
-   --enableInstance <clusterName InstanceName true/false>
-
-Enable/disable a partition
-   --enablePartition <clusterName instanceName resourceName partitionName true/false>
-
-
-   --listClusterInfo <clusterName>                                                                     Query info of a cluster
-   --listClusters                                                                                      List existing clusters
-   --listInstanceInfo <clusterName InstanceName>                                                       Query info of a Instance in a cluster
-   --listInstances <clusterName>                                                                       List Instances in a cluster
-   --listPartitionInfo <clusterName resourceName partitionName>                                        Query info of a partition
-   --listResourceInfo <clusterName resourceName>                                                       Query info of a resource
-   --listResources <clusterName>                                                                       List resources hosted in a cluster
-   --listStateModel <clusterName stateModelName>                                                       Query info of a state model in a cluster
-   --listStateModels <clusterName>                                                                     Query info of state models in a cluster
-
-```
-
-## Idealstate modes and configuration
-
-
- * AUTO mode: Partition to Node assignment is pre-generated using consistent hashing 
-
-```
-  setupTool.addResourceToCluster(clusterName, resourceName, partitionNumber, "MasterSlave")
-  setupTool.rebalanceStorageCluster(clusterName, resourceName, replicas)
-```
-
- * AUTO_REBALANCE mode: Partition to Node assignment is generated dynamically by cluster manager based on the nodes that are currently up and running
-
-```
- setupTool.addResourceToCluster(clusterName, resourceName, partitionNumber, "MasterSlave", "AUTO_REBALANCE")
- setupTool.rebalanceStorageCluster(clusterName, resourceName, replicas)
-```
-
- * CUSTOMIZED mode: Allows one to set the is pre-generated from a JSON format file
-
- ```
- setupTool.addIdealState(clusterName, resourceName, idealStateJsonFile)
- ```
-
-
-
-## Configuring state model
-
-```
-StateModelConfigGenerator generator = new StateModelConfigGenerator();
-ZnRecord stateModelConfig = generator.generateConfigForOnlineOffline();
-StateModelDefinition stateModelDef = new StateModelDefinition(stateModelConfig);
-ClusterSetup setupTool = new ClusterSetup(zkConnectString);
-setupTool.addStateModelDef(cluster,stateModelName,stateModelDef);
-```
-
-See StateModelConfigGenerator to get more info on creating custom state model.
-
-## Messaging Api usage
-
-See BootstrapProcess.java in examples package to see how Participants can exchange messages with each other.
-
-```
-      ClusterMessagingService messagingService = manager.getMessagingService();
-      //CONSTRUCT THE MESSAGE
-      Message requestBackupUriRequest = new Message(
-          MessageType.USER_DEFINE_MSG, UUID.randomUUID().toString());
-      requestBackupUriRequest
-          .setMsgSubType(BootstrapProcess.REQUEST_BOOTSTRAP_URL);
-      requestBackupUriRequest.setMsgState(MessageState.NEW);
-      //SET THE RECIPIENT CRITERIA, All nodes that satisfy the criteria will receive the message
-      Criteria recipientCriteria = new Criteria();
-      recipientCriteria.setInstanceName("*");
-      recipientCriteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
-      recipientCriteria.setResource("MyDB");
-      recipientCriteria.setPartition("");
-      //Should be processed only the process that is active at the time of sending the message. 
-      //This means if the recipient is restarted after message is sent, it will not be processed.
-      recipientCriteria.setSessionSpecific(true);
-      // wait for 30 seconds
-      int timeout = 30000;
-      //The handler that will be invoked when any recipient responds to the message.
-      BootstrapReplyHandler responseHandler = new BootstrapReplyHandler();
-      //This will return only after all recipients respond or after timeout.
-      int sentMessageCount = messagingService.sendAndWait(recipientCriteria,
-          requestBackupUriRequest, responseHandler, timeout);
-```
-
-For more details on MessagingService see ClusterMessagingService
-
-
-
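The `getInstances(resource, partition, state)` lookup shown in the Spectator section of this file can be pictured as a nested map that is rebuilt whenever the external view changes. The sketch below is plain Java with no Helix dependency. `SimpleRoutingTable`, its methods, and the partition and instance names are hypothetical and chosen for illustration only; this is not how `RoutingTableProvider` is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a routing table: resource -> partition -> state -> instances.
class SimpleRoutingTable {
    private final Map<String, Map<String, Map<String, List<String>>>> table = new HashMap<>();

    // Record that `instance` holds `partition` of `resource` in `state`.
    void put(String resource, String partition, String state, String instance) {
        table.computeIfAbsent(resource, r -> new HashMap<>())
             .computeIfAbsent(partition, p -> new HashMap<>())
             .computeIfAbsent(state, s -> new ArrayList<>())
             .add(instance);
    }

    // Return the instances holding the partition in the given state, or an empty list.
    List<String> getInstances(String resource, String partition, String state) {
        Map<String, Map<String, List<String>>> byPartition = table.get(resource);
        if (byPartition == null) {
            return Collections.emptyList();
        }
        Map<String, List<String>> byState = byPartition.get(partition);
        if (byState == null) {
            return Collections.emptyList();
        }
        List<String> instances = byState.get(state);
        return instances == null ? Collections.<String>emptyList() : instances;
    }

    public static void main(String[] args) {
        SimpleRoutingTable rt = new SimpleRoutingTable();
        rt.put("TestDB", "TestDB_0", "MASTER", "localhost_8900");
        rt.put("TestDB", "TestDB_0", "SLAVE", "localhost_8901");
        System.out.println(rt.getInstances("TestDB", "TestDB_0", "MASTER"));
    }
}
```

A spectator built this way would replace the whole map on each external view change rather than patching it in place, so readers never observe a half-updated table.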

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/Architecture.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Architecture.md b/src/site/markdown/Architecture.md
index cf14b47..bf7b4d0 100644
--- a/src/site/markdown/Architecture.md
+++ b/src/site/markdown/Architecture.md
@@ -216,6 +216,7 @@ The following picture shows how controllers, participants and spectators interac
 * After any task is completed by Participant, Controllers gets notified of the change and State Transition algorithm is re-run until the current state is same as Ideal State.
 
 ## Helix znode layout
+
 Helix organizes znodes under clusterName in multiple levels. 
 The top level (under clusterName) znodes are all Helix defined and in upper case
 * PROPERTYSTORE: application property store

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/Features.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Features.md b/src/site/markdown/Features.md
index 2457e8a..3ed36b8 100644
--- a/src/site/markdown/Features.md
+++ b/src/site/markdown/Features.md
@@ -18,10 +18,6 @@ under the License.
 -->
 
 
-As we started using Helix in production we found various things that were needed as part of most distributed data systems.
-
-These features have been implemented in a way that other systems can benefit.
-
 Partition Placement
 -------------------
 The placement of partitions in a DDS is very critical for reliability and scalability of the system. 
@@ -38,6 +34,7 @@ The partition assignment table can look like
 
     P1 -> {N1:M, N2:S}
     P2 -> {N1:S, N2:M}
+    
 This means Partition P1 must be a Master at N1 and Slave at N2 and vice versa for P2
 
 Helix provides multiple ways to control the partition placement. See Execution modes section for more info on this.
@@ -49,17 +46,17 @@ Helix uses this as the target state of the system and computes the appropriate t
 
 Helix supports 3 different execution modes which allows application to explicitly control the placement and state of the replica.
 
-AUTO_REBALANCE
+#### AUTO_REBALANCE
 When the idealstate mode is set to AUTO_REBALANCE, Helix controls both the location of the replica along with the state. This option is useful for applications where creation of a replica is not expensive. 
 A typical example is evenly distributing a group of tasks among the currently alive processes. For example, if there are 60 tasks and 4 nodes, Helix assigns 15 tasks to each node. 
 When one node fails Helix redistributes its 15 tasks to the remaining 3 nodes. Similarly, if a node is added, Helix re-allocates 3 tasks from each of the 4 nodes to the 5th node. 
 
-AUTO
+#### AUTO
 When the idealstate mode is set to AUTO, Helix only controls STATE of the replicas where as the location of the partition is controlled by application. 
 For example the application can say P1->{N1,N2,N3} which means P1 should only exist N1,N2,N3. In this mode when N1 fails, unlike in AUTO-REBALANCE mode the partition is not moved from N1 to others nodes in the cluster. 
 But Helix might decide to change the state of P1 in N2 and N3 based on the system constraints. For example, if a system constraint specified that there should be 1 Master and if the Master failed, then N2 will be made the master.
 
-CUSTOM
+#### CUSTOM
 Helix offers a third mode called CUSTOM, in which application can completely control the placement and state of each replica. Applications will have to implement an interface that Helix will invoke when the cluster state changes. 
 Within this callback, the application can recompute the partition assignment mapping. Helix will then issue transitions to get the system to the final state. Note that Helix will ensure that system constraints are not violated at any time.
 For example, the current state of the system might be P1 -> {N1:M,N2:S} and the application changes the ideal state to P2 -> {N1:S,N2:M}. Helix will not blindly issue M-S to N1 and S-M to N2 in parallel since it might result in a transient state where both N1 and N2 are masters.
@@ -113,6 +110,32 @@ Since Helix is aware of the global state of the system, it can send the message
 This is a very generic api and can also be used to schedule various periodic tasks in the cluster like data backups etc. 
 System Admins can also perform adhoc tasks like on demand backup or execute a system command(like rm -rf ;-)) across all nodes.
 
+```
+      ClusterMessagingService messagingService = manager.getMessagingService();
+      //CONSTRUCT THE MESSAGE
+      Message requestBackupUriRequest = new Message(
+          MessageType.USER_DEFINE_MSG, UUID.randomUUID().toString());
+      requestBackupUriRequest
+          .setMsgSubType(BootstrapProcess.REQUEST_BOOTSTRAP_URL);
+      requestBackupUriRequest.setMsgState(MessageState.NEW);
+      //SET THE RECIPIENT CRITERIA, All nodes that satisfy the criteria will receive the message
+      Criteria recipientCriteria = new Criteria();
+      recipientCriteria.setInstanceName("*");
+      recipientCriteria.setRecipientInstanceType(InstanceType.PARTICIPANT);
+      recipientCriteria.setResource("MyDB");
+      recipientCriteria.setPartition("");
+      //Should be processed only by the process that is active at the time of sending the message. 
+      //This means if the recipient is restarted after the message is sent, it will not be processed.
+      recipientCriteria.setSessionSpecific(true);
+      // wait for 30 seconds
+      int timeout = 30000;
+      //The handler that will be invoked when any recipient responds to the message.
+      BootstrapReplyHandler responseHandler = new BootstrapReplyHandler();
+      //This will return only after all recipients respond or after timeout.
+      int sentMessageCount = messagingService.sendAndWait(recipientCriteria,
+          requestBackupUriRequest, responseHandler, timeout);
+```
+
 See HelixManager.getMessagingService for more info.
 
 
@@ -148,7 +171,6 @@ This feature will be valuable in for distributed systems that support multi-tena
 This feature is not yet stable and do not recommend to be used in production.
 
 
-
 Controller deployment modes
 ---------------------------
 Read Architecture wiki for more details on the Role of a controller. In simple words, it basically controls the participants in the cluster by issuing transitions.
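The AUTO_REBALANCE example in the Features.md hunk above (60 tasks over 4 nodes, then 3 nodes after a failure, then 5 nodes after an addition) comes down to an even split. The sketch below only reproduces that arithmetic. `EvenSplit` and `tasksPerNode` are hypothetical names, and this is not Helix's rebalancing algorithm, which also has to decide which specific partitions move.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the even split described for AUTO_REBALANCE.
class EvenSplit {
    // Spread `tasks` over `nodes` so that per-node counts differ by at most one.
    static int[] tasksPerNode(int tasks, int nodes) {
        if (nodes <= 0) {
            throw new IllegalArgumentException("need at least one node");
        }
        int[] counts = new int[nodes];
        Arrays.fill(counts, tasks / nodes);
        // Hand out the remainder one task at a time to the first nodes.
        for (int i = 0; i < tasks % nodes; i++) {
            counts[i]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tasksPerNode(60, 4))); // [15, 15, 15, 15]
        System.out.println(Arrays.toString(tasksPerNode(60, 3))); // [20, 20, 20]
        System.out.println(Arrays.toString(tasksPerNode(60, 5))); // [12, 12, 12, 12, 12]
    }
}
```

Going from 4 to 5 nodes drops each original node from 15 to 12 tasks, which matches the "3 tasks from each of the 4 nodes" in the text.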

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/Quickstart.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Quickstart.md b/src/site/markdown/Quickstart.md
index 13e1015..992f73f 100644
--- a/src/site/markdown/Quickstart.md
+++ b/src/site/markdown/Quickstart.md
@@ -104,7 +104,7 @@ cd helix-core-pkg
      
 ##### start zookeeper locally at port 2199
 
-    ./start-standalone-zookeeper 2199 &
+    ./start-standalone-zookeeper.sh 2199 &
 
 ##### create the cluster mycluster
     ## helix-admin.sh --zkSvr localhost:2199 --addCluster <clustername> 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/Tutorial.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/Tutorial.md b/src/site/markdown/Tutorial.md
new file mode 100644
index 0000000..c7c1f23
--- /dev/null
+++ b/src/site/markdown/Tutorial.md
@@ -0,0 +1,334 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Let's walk through the steps of building a distributed system using Helix.
+
+### Start zookeeper
+
+This starts Zookeeper in standalone mode. For production deployment, see the [Apache Zookeeper] page for instructions.
+
+```
+    ./start-standalone-zookeeper.sh 2199 &
+```
+
+### Create cluster
+
+Creating a cluster creates the appropriate znodes in Zookeeper.
+
+```
+    //Create the admin instance
+    ZKHelixAdmin admin = new ZKHelixAdmin(ZK_ADDRESS);
+    String CLUSTER_NAME = "helix-demo";
+    //Create cluster namespace in zookeeper
+    admin.addCluster(CLUSTER_NAME, true);
+```
+
+OR
+
+```
+    ./helix-admin.sh --zkSvr localhost:2199 --addCluster helix-demo 
+```
+
+
+### Configure nodes
+
+Add new nodes to the cluster and configure them. Each node in the cluster must be uniquely identifiable. 
+The most commonly used convention is hostname:port.
+
+
+```
+    String CLUSTER_NAME = "helix-demo";
+    int NUM_NODES = 2;
+    String hosts[] = new String[]{"localhost","localhost"};
+    String ports[] = new String[]{"7000","7001"};
+    for (int i = 0; i < NUM_NODES; i++)
+    {
+      
+      InstanceConfig instanceConfig = new InstanceConfig(hosts[i]+ "_" + ports[i]);
+      instanceConfig.setHostName(hosts[i]);
+      instanceConfig.setPort(ports[i]);
+      instanceConfig.setInstanceEnabled(true);
+      //Add additional system specific configuration if needed. These can be accessed during the node start up.
+      instanceConfig.getRecord().setSimpleField("key", "value");
+      admin.addInstance(CLUSTER_NAME, instanceConfig);
+      
+    }
+
+```
+
+### Configure the resource
+
+A resource represents the actual task performed by the nodes. It can be a database, index, topic, queue or any other kind of processing.
+A resource can be divided into many sub-parts called partitions. 
+
+
+#### Define state model and constraints
+
+For scalability and fault tolerance each partition can have one or more replicas. 
+A state model lets you declare the system behavior by first enumerating the various STATES and the TRANSITIONS between them.
+A simple model is ONLINE-OFFLINE, where ONLINE means the task is active and OFFLINE means it is not active.
+You can also specify how many replicas must be in each state. 
+For example, in a search system, one might need more than one node serving the same index. 
+Helix allows one to express this via constraints on each STATE.   
+
+The following snippet shows how to declare the state model and constraints for the MASTER-SLAVE model.
+
+```
+
+    StateModelDefinition.Builder builder = new StateModelDefinition.Builder(
+        STATE_MODEL_NAME);
+    // Add states and their rank to indicate priority. The lower the rank, the
+    // higher the priority.
+    builder.addState(MASTER, 1);
+    builder.addState(SLAVE, 2);
+    builder.addState(OFFLINE);
+    // Set the initial state when the node starts
+    builder.initialState(OFFLINE);
+
+    // Add transitions between the states.
+    builder.addTransition(OFFLINE, SLAVE);
+    builder.addTransition(SLAVE, OFFLINE);
+    builder.addTransition(SLAVE, MASTER);
+    builder.addTransition(MASTER, SLAVE);
+
+    // set constraints on states.
+    // static constraint
+    builder.upperBound(MASTER, 1);
+    // dynamic constraint: R means the bound is derived from the replica count,
+    // which allows a different replication factor for each resource without
+    // having to define a new state model
+    builder.dynamicUpperBound(SLAVE, "R");
+
+    StateModelDefinition statemodelDefinition = builder.build();
+    admin.addStateModelDef(CLUSTER_NAME, STATE_MODEL_NAME, statemodelDefinition);
+   
+```
+
+
+
+ 
+#### Assigning partitions to nodes
+
+The final goal of Helix is to ensure that the constraints on the state model are satisfied. 
+Helix does this by assigning a STATE to a partition and placing it on a particular node.
+
+
+Helix can operate in three assignment modes:
+
+* AUTO_REBALANCE: Helix decides the placement and state of a partition.
+* AUTO: Application decides the placement but Helix decides the state of a partition.
+* CUSTOM: Application controls the placement and state of a partition.
+
+For more information on the modes, see the *partition placement* section on the [Features](./Features.html) page.
+
+```
+    String RESOURCE_NAME="MyDB";
+    int NUM_PARTITIONS=6;
+    STATE_MODEL_NAME = "MasterSlave";
+    String MODE = "AUTO";
+    int NUM_REPLICAS = 2;
+    admin.addResource(CLUSTER_NAME, RESOURCE_NAME, NUM_PARTITIONS, STATE_MODEL_NAME, MODE);
+    admin.rebalance(CLUSTER_NAME, RESOURCE_NAME, NUM_REPLICAS);
+```
+
+### Starting a Helix based process
+
+The first step in using the Helix API is to create a Helix manager instance. 
+It requires the following parameters:
+ 
+* clusterName: A logical name to represent the group of nodes
+* instanceName: A logical name of the process creating the manager instance. Generally this is host:port.
+* instanceType: Type of the process. This can be one of the following types:
+    * CONTROLLER: Process that controls the cluster. Any number of controllers can be started, but only one is active at any given time.
+    * PARTICIPANT: Process that performs the actual task in the distributed system. 
+    * SPECTATOR: Process that observes the changes in the cluster.
+    * ADMIN: To carry out system admin actions.
+* zkConnectString: Connection string to Zookeeper. This is of the form host1:port1,host2:port2,host3:port3. 
+
+```
+      manager = HelixManagerFactory.getZKHelixManager(clusterName,
+                                                      instanceName,
+                                                      instanceType,
+                                                      zkConnectString);
+```
+                                                      
+
+
+### Participant
+Starting up a participant is pretty straightforward. After the Helix manager instance is created, the only thing that needs to be registered is the state model factory. 
+The methods on the state model are called when the controller sends transitions to the participant.
+
+```
+      manager = HelixManagerFactory.getZKHelixManager(clusterName,
+                                                          instanceName,
+                                                          InstanceType.PARTICIPANT,
+                                                          zkConnectString);
+     StateMachineEngine stateMach = manager.getStateMachineEngine();
+     //create a stateModelFactory that returns a statemodel object for each partition. 
+     stateModelFactory = new OnlineOfflineStateModelFactory();     
+     stateMach.registerStateModelFactory(stateModelType, stateModelFactory);
+     manager.connect();
+```
+
+```
+public class OnlineOfflineStateModelFactory extends
+        StateModelFactory<StateModel> {
+    @Override
+    public StateModel createNewStateModel(String stateUnitKey) {
+        OnlineOfflineStateModel stateModel = new OnlineOfflineStateModel();
+        return stateModel;
+    }
+    @StateModelInfo(states = "{'OFFLINE','ONLINE'}", initialState = "OFFLINE")
+    public static class OnlineOfflineStateModel extends StateModel {
+        @Transition(from = "OFFLINE", to = "ONLINE")
+        public void onBecomeOnlineFromOffline(Message message,
+                NotificationContext context) {
+            System.out
+                    .println("OnlineOfflineStateModel.onBecomeOnlineFromOffline()");
+            //Application logic to handle transition 
+        }
+        @Transition(from = "ONLINE", to = "OFFLINE")
+        public void onBecomeOfflineFromOnline(Message message,
+                NotificationContext context) {
+            System.out
+                        .println("OnlineOfflineStateModel.onBecomeOfflineFromOnline()");
+            //Application logic to handle transition
+        }
+    }
+}
+```
+
+### Controller Code
+The controller needs to know about all changes in the cluster. Helix comes with a default implementation that handles all changes in the cluster. 
+If you need additional functionality, see GenericHelixController for how to configure the pipeline.
+
+
+```
+      manager = HelixManagerFactory.getZKHelixManager(clusterName,
+                                                          instanceName,
+                                                          InstanceType.CONTROLLER,
+                                                          zkConnectString);
+     manager.connect();
+     GenericHelixController controller = new GenericHelixController();
+     manager.addConfigChangeListener(controller);
+     manager.addLiveInstanceChangeListener(controller);
+     manager.addIdealStateChangeListener(controller);
+     manager.addExternalViewChangeListener(controller);
+     manager.addControllerListener(controller);
+```
+The snippet above shows how the controller is started. You can also start the controller from the command line.
+  
+```
+cd helix
+mvn clean install -Dmaven.test.skip=true
+cd helix-core/target/helix-core-pkg/bin
+chmod +x *
+./run-helix-controller.sh --zkSvr <ZookeeperServerAddress(Required)>  --cluster <Cluster name (Required)>
+```
+See the controller deployment modes section on the [Features](./Features.html) page for the different ways to deploy the controller.
+
+### Spectator Code
+A spectator simply observes the cluster and is notified when the state of the system changes. Helix consolidates the state of the entire cluster in one znode called ExternalView.
+Helix provides a default implementation, RoutingTableProvider, that caches the cluster state and updates it when there is a change in the cluster.
+
+```
+manager = HelixManagerFactory.getZKHelixManager(clusterName,
+                                                          instanceName,
+                                                          InstanceType.SPECTATOR,
+                                                          zkConnectString);
+manager.connect();
+RoutingTableProvider routingTableProvider = new RoutingTableProvider();
+manager.addExternalViewChangeListener(routingTableProvider);
+
+```
+
+To find out which instances are serving a partition in a given state, use the following API:
+
+```
+instances = routingTableProvider.getInstances("DBNAME", "PARTITION_NAME", "PARTITION_STATE");
+```
+
+### Zookeeper znode layout
+
+See the *Helix znode layout* section on the [Architecture](./Architecture.html) page.
+
+
+###  Helix Admin operations
+
+Helix provides multiple ways to administer the cluster. It has a command line interface and also a REST interface.
+
+```
+cd helix
+mvn clean install -Dmaven.test.skip=true
+cd helix-core/target/helix-core-pkg/bin
+chmod +x *
+./helix-admin.sh --help
+Provide zookeeper address. Required for all commands  
+   --zkSvr <ZookeeperServerAddress(Required)>       
+
+Add a new cluster                                                          
+   --addCluster <clusterName>                              
+
+Add a new Instance to a cluster                                    
+   --addNode <clusterName InstanceAddress(host:port)>                                      
+
+Add a State model to a cluster                                     
+   --addStateModelDef <clusterName <filename>>    
+
+Add a resource to a cluster            
+   --addResource <clusterName resourceName partitionNum stateModelRef <mode(AUTO_REBALANCE|AUTO|CUSTOM)>>      
+
+Upload an IdealState(Partition to Node Mapping)                                         
+   --addIdealState <clusterName resourceName <filename>>            
+
+Delete a cluster
+   --dropCluster <clusterName>                                                                         
+
+Delete a resource
+   --dropResource <clusterName resourceName>                                                           Drop an existing resource from a cluster
+
+Drop an existing Instance from a cluster    
+   --dropNode <clusterName InstanceAddress(host:port)>                    
+
+Enable/disable the entire cluster. This basically pauses the controller, which means no transitions will be triggered, but the existing nodes in the cluster continue to function 
+   --enableCluster <clusterName>
+
+Enable/disable a Instance. Useful to take a faulty node out of the cluster.
+   --enableInstance <clusterName InstanceName true/false>
+
+Enable/disable a partition
+   --enablePartition <clusterName instanceName resourceName partitionName true/false>
+
+
+   --listClusterInfo <clusterName>                                                                     Query info of a cluster
+   --listClusters                                                                                      List existing clusters
+   --listInstanceInfo <clusterName InstanceName>                                                       Query info of a Instance in a cluster
+   --listInstances <clusterName>                                                                       List Instances in a cluster
+   --listPartitionInfo <clusterName resourceName partitionName>                                        Query info of a partition
+   --listResourceInfo <clusterName resourceName>                                                       Query info of a resource
+   --listResources <clusterName>                                                                       List resources hosted in a cluster
+   --listStateModel <clusterName stateModelName>                                                       Query info of a state model in a cluster
+   --listStateModels <clusterName>                                                                     Query info of state models in a cluster
+
+```
+
+
+
+
+
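
Note on the Spectator Code section of Tutorial.md above: the lookup `getInstances(resource, partition, state)` can be pictured as a nested map that is rebuilt whenever the external view changes. The following is a minimal sketch of that idea in plain Java (8 or later), not Helix's actual `RoutingTableProvider`. The class `RoutingTableSketch`, its method `onExternalViewChange`, and the shape of the `view` argument are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; this is not Helix's RoutingTableProvider.
final class RoutingTableSketch {
    // resource -> partition -> state -> instances currently in that state
    private final Map<String, Map<String, Map<String, List<String>>>> table = new HashMap<>();

    // Replace the cached view of one resource.
    // Assumed input shape: partition -> (instance -> state).
    synchronized void onExternalViewChange(String resource,
                                           Map<String, Map<String, String>> view) {
        Map<String, Map<String, List<String>>> byPartition = new HashMap<>();
        for (Map.Entry<String, Map<String, String>> partition : view.entrySet()) {
            Map<String, List<String>> byState = new HashMap<>();
            for (Map.Entry<String, String> replica : partition.getValue().entrySet()) {
                // Group instance names by the state they are currently in.
                byState.computeIfAbsent(replica.getValue(), s -> new ArrayList<>())
                       .add(replica.getKey());
            }
            byPartition.put(partition.getKey(), byState);
        }
        table.put(resource, byPartition);
    }

    // Returns a copy of the instances holding the partition in the given state,
    // or an empty list when the resource, partition or state is unknown.
    synchronized List<String> getInstances(String resource, String partition, String state) {
        List<String> found = table
                .getOrDefault(resource, Collections.emptyMap())
                .getOrDefault(partition, Collections.emptyMap())
                .getOrDefault(state, Collections.emptyList());
        return new ArrayList<>(found);
    }
}
```

With a view in which partition `MyDB_0` has the (made-up) instance `host1_12918` in state `ONLINE`, `getInstances("MyDB", "MyDB_0", "ONLINE")` returns a list containing that instance, and a query for an unknown partition returns an empty list. Rebuilding the whole per-resource map on each change keeps readers from ever seeing a half-updated view, at the cost of some extra allocation.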

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/markdown/index.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/index.md b/src/site/markdown/index.md
index d1fd288..99504eb 100644
--- a/src/site/markdown/index.md
+++ b/src/site/markdown/index.md
@@ -45,29 +45,35 @@ Helix is a generic cluster management framework used for the automatic managemen
 -----
 
 OVERVIEW
--------------
-Helix uses terms that are commonly used to describe distributed data system concepts. 
+---------
 
-1. Cluster: A logical set of Instances that perform a similar set of activities. 
-2. Instance: An Instance is a logical entity in the cluster that can be identified by a unique Id. 
-3. Node: A Node is a physical entity in the cluster. A Node can have one or more logical Instances. 
-4. Resource: A resource represents the logical entity hosted by the distributed system. It can be a database name, index or a task group name 
-5. Partition: A resource is generally split into one or more partitions.
-6. Replica: Each partition can have one or more replicas
-7. State: Each replica can have state associated with it. For example: Master, Slave, Leader, Stand By, Offline, Online etc. 
+A distributed system comprises one or more *nodes*. Depending on its purpose, each node performs a specific task. For example, in a search system it can be an index, in a pub-sub system it can be a topic/queue, and in a storage system it can be a database. Helix refers to such a task as a *resource*. In order to scale the system, each node is responsible for a part of the task, referred to as a *partition*. For scalability and fault tolerance, the task associated with each partition can run on multiple nodes. Helix refers to each of these as a *replica*. 
+ 
+Helix refers to each node in the cluster as a *PARTICIPANT*. As in many distributed systems, there is a central component called the *CONTROLLER* that coordinates the *PARTICIPANT*s during start-up, failures and cluster expansion. Most distributed systems also need to provide a service discovery mechanism so that external entities such as clients, request routers and load balancers can interact with the system. These external entities are referred to as *SPECTATOR*s.
 
-To summarize, a resource (database, index or any task) in general is partitioned, replicated and distributed among the Instance/nodes in the cluster and each partition has a state associated with it. 
+Helix is built on top of Zookeeper, which it uses to store the cluster state and as the communication channel between the CONTROLLER, PARTICIPANTs and SPECTATORs. There is no single point of failure in Helix.
 
-Helix manages the state of a resource by supporting a pluggable distributed state machine. One can define the state machine table along with the constraints for each state. 
+The architecture of a distributed system managed by Helix:
 
-Here are some common state models used
+![Helix Design](images/HELIX-components.png)
 
-1. Master, Slave
-2. Online, Offline
-3. Leader, Standby.
 
-For example in the case of a MasterSlave state model one can specify the state machine as follows. The table says given a start state and an end state what should be the next state. 
-For example, if the current state is Offline and the target state is Master, the table says that the next state is Slave.  So in this case, Helix issues an Offline-Slave transition
+WHAT MAKES IT GENERIC
+---------------------
+
+Even though most distributed systems follow a similar mechanism of coordinating the nodes through a controller or Zookeeper, the implementation is 
+specific to the use case. Helix separates the cluster management of a distributed system from its core functionality. 
+
+Helix allows one to express the system behavior via a pluggable finite state machine.
+
+Consider a simple use case where all partitions are actively processing search query requests. 
+We can express it using an OnlineOffline state model, where a task can be either 
+ONLINE (task is active) or OFFLINE (not active).
+
+Similarly, take a slightly more complicated system where we need three states: OFFLINE, SLAVE and MASTER. 
+
+The following state machine table gives the next state for each combination of start state and end state. For example, if the current state is Offline and the target state is Master,
+ the table says that the first transition must be Offline-Slave, followed by Slave-Master.
 
 ```
           OFFLINE  | SLAVE  |  MASTER  
@@ -84,25 +90,37 @@ MASTER  | SLAVE    | SLAVE  |   N/A   |
 
 ```
 
-Helix also supports the ability to provide constraints on each state. For example in a MasterSlave state model with a replication factor of 3 one can say 
+
+Another unique feature of Helix is that it allows one to add constraints on each state and transition. 
+
+For example: 
+In an OnlineOffline state model, one can enforce a constraint that there should be 3 replicas in the ONLINE state per partition.
+
+    ONLINE:3
+
+In a MasterSlave state model with a replication factor of 3, one can enforce a single master by specifying constraints on the number of Masters and Slaves.
 
     MASTER:1 
     SLAVE:2
 
-Helix will automatically maintain 1 Master and 2 Slaves by initiating appropriate state transitions on each instance in the cluster. 
+Given these constraints, Helix will ensure that there is 1 Master and 2 Slaves by initiating appropriate state transitions in the cluster.
 
-Each transition results in a partition moving from its CURRENT state to a NEW state. These transitions are triggered on changes in the cluster state like 
 
-* Node start up
-* Node soft and hard failures 
-* Addition of resources
-* Addition of nodes
+Apart from constraints on states, Helix supports constraints on transitions as well. For example, consider an OFFLINE-BOOTSTRAP transition where a service downloads the index over the network. 
+Without any throttling during start-up of a cluster, all nodes might start downloading at once, which might impact system stability. 
+Using Helix, without changing any application code, one can simply place a constraint of at most 5 OFFLINE-BOOTSTRAP transitions across the entire cluster.
+
+The constraints can be at any scope node, resource, transition type and 
+
+Helix comes with 3 commonly used state models; you can also plug in your own custom state model. 
+
+1. Master, Slave
+2. Online, Offline
+3. Leader, Standby.
 
-In simple words, Helix is a distributed state machine with support for constraints on each state.
 
 Helix framework can be used to build distributed, scalable, elastic and fault tolerant systems by configuring the distributed state machine and its constraints based on application requirements. The application has to provide the implementation for handling state transitions appropriately. Example 
 
-Once the state machine and constraints are configured through Helix, application will have the provide implementation to handle the transitions appropriately.  
 
 ```
 MasterSlaveStateModel extends HelixStateModel {
@@ -122,24 +140,60 @@ MasterSlaveStateModel extends HelixStateModel {
 }
 ```
 
-Once the state machine is configured, the framework allows one to 
+Each transition results in a partition moving from its CURRENT state to a NEW state. These transitions are triggered on changes in the cluster state like 
+
+* Node start up
+* Node soft and hard failures 
+* Addition of resources
+* Addition of nodes
+
+
+TERMINOLOGIES
+-------------
+Helix uses terms that are commonly used to describe distributed data system concepts. 
+
+1. Cluster: A logical set of Instances that perform a similar set of activities. 
+2. Instance: An Instance is a logical entity in the cluster that can be identified by a unique Id. 
+3. Node: A Node is a physical entity in the cluster. A Node can have one or more logical Instances. 
+4. Resource: A resource represents the logical entity hosted by the distributed system. It can be a database name, index or a task group name 
+5. Partition: A resource is generally split into one or more partitions.
+6. Replica: Each partition can have one or more replicas
+7. State: Each replica can have state associated with it. For example: Master, Slave, Leader, Stand By, Offline, Online etc. 
 
-* Dynamically add nodes to the cluster
-* Automatically modify the topology(rebalance partitions) of the cluster  
-* Dynamically add resources to the cluster
-* Enable/disable partition/instances for software upgrade without impacting availability.
 
-Helix uses Zookeeper for maintaining the cluster state and change notifications.
 
 WHY HELIX
 -------------
-Helix approach of using a distributed state machine with constraints on state and transitions has benefited us in multiple ways.
+Helix's approach of using a distributed state machine with constraints on states and transitions has the following benefits:
 
-* Abstract cluster management aspects from the core functionality of DDS.
-* Each node in DDS is not aware of the global state since they simply have to follow . This proved quite useful since we could deploy the same system in different topologies.
+* Abstract cluster management from the core functionality.
+* Quick transformation from a single node system to a distributed system.
+* A PARTICIPANT is not aware of the global state, since it simply has to follow the instructions issued by the CONTROLLER. This design provides a clear division of responsibilities and makes issues easier to debug.
 * Since the controller's goal is to satisfy state machine constraints at all times, use cases like cluster startup, node failure, cluster expansion are solved in a similar way.
 
-At LinkedIn, we have been able to use this to manage 3 different distributed systems that look very different on paper.  
+
+BUILD INSTRUCTIONS
+-------------------------
+
+Requirements: Jdk 1.6+, Maven 2.0.8+
+
+```
+    git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
+    cd incubator-helix
+    mvn install package -DskipTests 
+```
+
+Maven dependency
+
+```
+    <dependency>
+      <groupId>org.apache.helix</groupId>
+      <artifactId>helix-core</artifactId>
+      <version>0.6.0-incubating</version>
+    </dependency>
+```
+
+[Download](./download.html) Helix artifacts from here.
    
 PUBLICATIONS
 -------------

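Note on the MasterSlave state machine table in the index.md change above: the table can be read as a lookup from (current state, target state) to the next state, applied repeatedly until the target is reached. The following is a minimal sketch of that reading in plain Java (8 or later), not part of the Helix API. The class `MasterSlaveTableSketch` and its methods are hypothetical, and the OFFLINE and SLAVE rows, which this diff does not show in full, are filled in from the accompanying prose as an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not part of the Helix API.
final class MasterSlaveTableSketch {
    // NEXT.get(current).get(target) is the state to move to first.
    private static final Map<String, Map<String, String>> NEXT = new HashMap<>();

    static {
        // OFFLINE and SLAVE rows are assumed from the prose description.
        put("OFFLINE", "SLAVE", "SLAVE");
        put("OFFLINE", "MASTER", "SLAVE");   // no direct OFFLINE-MASTER transition
        put("SLAVE", "OFFLINE", "OFFLINE");
        put("SLAVE", "MASTER", "MASTER");
        // MASTER row as shown in the table.
        put("MASTER", "OFFLINE", "SLAVE");   // must step down to SLAVE first
        put("MASTER", "SLAVE", "SLAVE");
    }

    private static void put(String from, String target, String next) {
        NEXT.computeIfAbsent(from, k -> new HashMap<>()).put(target, next);
    }

    // Returns the ordered transitions (for example "OFFLINE-SLAVE") needed to
    // move a replica from its current state to the target state.
    static List<String> transitions(String current, String target) {
        List<String> path = new ArrayList<>();
        String state = current;
        while (!state.equals(target)) {
            Map<String, String> row = NEXT.get(state);
            String next = (row == null) ? null : row.get(target);
            if (next == null) {
                throw new IllegalArgumentException(
                        "No transition from " + state + " toward " + target);
            }
            path.add(state + "-" + next);
            state = next;
        }
        return path;
    }
}
```

Under these assumptions, `transitions("OFFLINE", "MASTER")` yields the two steps OFFLINE-SLAVE and SLAVE-MASTER, which matches the example in the text. Keeping only the next hop in the table, rather than full paths, means a new state can be added by filling in one row and one column.
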
http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/site.xml
----------------------------------------------------------------------
diff --git a/src/site/site.xml b/src/site/site.xml
index 981c090..1db3303 100644
--- a/src/site/site.xml
+++ b/src/site/site.xml
@@ -40,13 +40,12 @@
     </breadcrumbs>
 
     <menu name="Helix">
-      <item name="About" href="./index.html"/>
+      <item name="Introduction" href="./index.html"/>
       <item name="Quick Start" href="./Quickstart.html"/>
-      <item name="Api Usage" href="./ApiUsage.html"/>
+      <item name="Tutorial" href="./Tutorial.html"/>
       <item name="Architecture" href="./Architecture.html"/>
       <item name="Features" href="./Features.html"/>
-      <item name="Current release ${currentRelease}" href="releasenotes/release-${currentRelease}.html"/>
-      <item name="Current Version Change Log" href="./jira-report.html"/>
+      <item name="release ${currentRelease}" href="releasenotes/release-${currentRelease}.html"/>
       <item name="Download" href="./download.html"/>
     </menu>
     
@@ -65,9 +64,10 @@
       <item name="Building Guide" href="/involved/building.html"/>
       <item name="Release Guide" href="/releasing.html"/>
     </menu>
-
+<!--
     <menu ref="reports" inherit="bottom"/>
-    <!--<menu ref="modules" inherit="bottom"/>-->
+    <menu ref="modules" inherit="bottom"/>
+
 
     <menu name="ASF">
       <item name="How Apache Works" href="http://www.apache.org/foundation/how-it-works.html"/>
@@ -75,7 +75,7 @@
       <item name="Sponsoring Apache" href="http://www.apache.org/foundation/sponsorship.html"/>
       <item name="Thanks" href="http://www.apache.org/foundation/thanks.html"/>
     </menu>
-
+-->
 
   </body>
 
@@ -84,6 +84,11 @@
       <topBarEnabled>true</topBarEnabled>
       <sideBarEnabled>true</sideBarEnabled>
       <googleSearch></googleSearch>
+      <twitter>
+        <user>apachehelix</user>
+        <showUser>true</showUser>
+        <showFollowers>false</showFollowers>
+      </twitter>
     </fluidoSkin>
   </custom>
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/c44a2bad/src/site/xdoc/download.xml.vm
----------------------------------------------------------------------
diff --git a/src/site/xdoc/download.xml.vm b/src/site/xdoc/download.xml.vm
index cf4bb57..6671276 100644
--- a/src/site/xdoc/download.xml.vm
+++ b/src/site/xdoc/download.xml.vm
@@ -44,8 +44,8 @@ under the License.
     </section>
 
     <section name="Current Release">
-      <p>Release date: ?</p>
-      <p><a href="releasenotes/release-${currentRelease}.html">Release notes</a></p>
+      <p>Release date: 01/29/2013 </p>
+      <p><a href="releasenotes/release-${currentRelease}.html">0.6.0-incubating Release notes</a></p>
       <a name="mirror"/>
       <subsection name="Mirror">
 

