Modified: incubator/helix/site-content/releasing.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/releasing.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/releasing.html (original) +++ incubator/helix/site-content/releasing.html Tue Sep 3 16:43:37 2013 @@ -1,6 +1,6 @@ @@ -8,7 +8,7 @@ - + Apache Helix - Helix release process @@ -87,6 +87,9 @@
  • Distributed task DAG Execution
  • + +
  • User-Defined Rebalancer Example +
  • Last Published: 2013-08-28
  • +
  • Last Published: 2013-09-03
  • Modified: incubator/helix/site-content/sonar.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/sonar.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/sonar.html (original) +++ incubator/helix/site-content/sonar.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - Sonar @@ -86,6 +86,9 @@
  • Modified: incubator/helix/site-content/source-repository.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/source-repository.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/source-repository.html (original) +++ incubator/helix/site-content/source-repository.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - Source Repository @@ -86,6 +86,9 @@
  • Modified: incubator/helix/site-content/team-list.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/team-list.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/team-list.html (original) +++ incubator/helix/site-content/team-list.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - Team list @@ -86,6 +86,9 @@
  • Modified: incubator/helix/site-content/tutorial_admin.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_admin.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/tutorial_admin.html (original) +++ incubator/helix/site-content/tutorial_admin.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - @@ -86,6 +86,9 @@
  • @@ -193,53 +196,454 @@ software distributed under the License i "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations -under the License. -->

    Helix Tutorial: Admin Operations

    Helix provides interfaces for the operator to administer the cluster. For convenience, there is a command line interface as well as a REST interface.

    Helix Admin operations

    First, make sure you get to the command-line tool, or include it in your shell PATH.

    -
    cd helix/helix-core/target/helix-core-pkg/bin
    -

    Get help

    -
    ./helix-admin.sh --help
    +under the License. -->

    Helix Tutorial: Admin Operations

Helix provides a set of admin APIs for cluster management operations. They are supported via:

    +
      +
    • Java API
    • +
• Command-line interface
    • +
    • REST interface via helix-admin-webapp
    • +

    Java API

    See interface org.apache.helix.HelixAdmin

    Command-line interface

The command-line tool comes with the helix-core package:

    Get the command-line tool:

    +
      - git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
    +  - cd incubator-helix
    +  - ./build
    +  - cd helix-core/target/helix-core-pkg/bin
    +  - chmod +x *.sh
    +

    Get help:

    +
      - ./helix-admin.sh --help
     

    All other commands have this form:

    -
    ./helix-admin.sh --zkSvr <ZookeeperServerAddress (Required)> <command> <parameters>
    -

    Now, here are the admin commands:

    Add a new cluster

    -
       --addCluster <clusterName>                              
    -

    Add a new Instance to a cluster

    -
       --addNode <clusterName> <InstanceAddress (host:port)>
    -

    Add a State model to a cluster WE NEED A SPEC FOR A VALID STATE MODEL

    -
       --addStateModelDef <clusterName> <filename>>    
    -

    Add a resource to a cluster

    -
       --addResource <clusterName> <resourceName> <partitionNum> <stateModelRef> <mode (AUTO_REBALANCE|AUTO|CUSTOM)>
    -

    Upload an IdealState (Partition to Node Mapping) WE NEED A SPEC FOR A VALID IDEAL STATE

    -
       --addIdealState <clusterName> <resourceName> <filename>
    -

    Delete a cluster

    -
       --dropCluster <clusterName>                                                                         
    -

    Delete a resource (drop an existing resource from a cluster)

    -
       --dropResource <clusterName> <resourceName>
    -

    Drop an existing instance from a cluster

    -
       --dropNode <clusterName> <InstanceAddress (host:port)>
    -

    Enable/disable the entire cluster. This will pause the controller, which means no transitions will be trigger, but the existing nodes in the cluster continue to function, but without any management by the controller.

    -
       --enableCluster <clusterName> <true/false>
    -

    Enable/disable an instance. Useful to take a node out of the cluster for maintenance/upgrade.

    -
       --enableInstance <clusterName> <InstanceName> <true/false>
    -

    Enable/disable a partition

    -
       --enablePartition <clusterName> <instanceName> <resourceName> <partitionName> <true/false>
    -

    Query info of a cluster

    -
       --listClusterInfo <clusterName>
    -

    List existing clusters (remember, Helix can manage multiple clusters)

    -
       --listClusters
    -

    Query info of a single Instance in a cluster

    -
       --listInstanceInfo <clusterName> <InstanceName>
    -

    List instances in a cluster

    -
       --listInstances <clusterName>
    -

    Query info of a partition

    -
       --listPartitionInfo <clusterName> <resourceName> <partitionName>
    -

    Query info of a resource

    -
       --listResourceInfo <clusterName> <resourceName>
    -

    List resources hosted in a cluster

    -
       --listResources <clusterName>
    -

    Query info of a state model in a cluster

    -
       --listStateModel <clusterName> <stateModelName>
    -

    Query info of state models in a cluster

    -
       --listStateModels <clusterName>                                                                     
    -
    +
      ./helix-admin.sh --zkSvr <ZookeeperServerAddress> <command> <parameters>
    +

    Admin commands and brief description:

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Command syntax Description
    --activateCluster <clusterName controllerCluster true/false> Enable/disable a cluster in distributed controller mode
    --addCluster <clusterName> Add a new cluster
    --addIdealState <clusterName resourceName fileName.json> Add an ideal state to a cluster
    --addInstanceTag <clusterName instanceName tag> Add a tag to an instance
    --addNode <clusterName instanceId> Add an instance to a cluster
    --addResource <clusterName resourceName partitionNumber stateModelName> Add a new resource to a cluster
    --addResourceProperty <clusterName resourceName propertyName propertyValue> Add a resource property
    --addStateModelDef <clusterName fileName.json> Add a State model definition to a cluster
    --dropCluster <clusterName> Delete a cluster
    --dropNode <clusterName instanceId> Remove a node from a cluster
    --dropResource <clusterName resourceName> Remove an existing resource from a cluster
    --enableCluster <clusterName true/false> Enable/disable a cluster
    --enableInstance <clusterName instanceId true/false> Enable/disable an instance
    --enablePartition <true/false clusterName nodeId resourceName partitionName> Enable/disable a partition
    --getConfig <configScope configScopeArgs configKeys> Get user configs
    --getConstraints <clusterName constraintType> Get constraints
    --help print help information
    --instanceGroupTag <instanceTag> Specify instance group tag, used with rebalance command
    --listClusterInfo <clusterName> Show information of a cluster
    --listClusters List all clusters
    --listInstanceInfo <clusterName instanceId> Show information of an instance
    --listInstances <clusterName> List all instances in a cluster
    --listPartitionInfo <clusterName resourceName partitionName> Show information of a partition
    --listResourceInfo <clusterName resourceName> Show information of a resource
    --listResources <clusterName> List all resources in a cluster
    --listStateModel <clusterName stateModelName> Show information of a state model
    --listStateModels <clusterName> List all state models in a cluster
    --maxPartitionsPerNode <maxPartitionsPerNode> Specify the max partitions per instance, used with addResourceGroup command
    --rebalance <clusterName resourceName replicas> Rebalance a resource
    --removeConfig <configScope configScopeArgs configKeys> Remove user configs
    --removeConstraint <clusterName constraintType constraintId> Remove a constraint
    --removeInstanceTag <clusterName instanceId tag> Remove a tag from an instance
    --removeResourceProperty <clusterName resourceName propertyName> Remove a resource property
    --resetInstance <clusterName instanceId> Reset all erroneous partitions on an instance
    --resetPartition <clusterName instanceId resourceName partitionName> Reset an erroneous partition
    --resetResource <clusterName resourceName> Reset all erroneous partitions of a resource
    --setConfig <configScope configScopeArgs configKeyValueMap> Set user configs
    --setConstraint <clusterName constraintType constraintId constraintKeyValueMap> Set a constraint
    --swapInstance <clusterName oldInstance newInstance> Swap an old instance with a new instance
    --zkSvr <ZookeeperServerAddress> Provide zookeeper address
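As a sketch, the invocation form above can be wrapped in a small helper that assembles the argument list before shelling out. The script path and the ZooKeeper address below are illustrative assumptions; actually running the commands requires a built helix-core package and a live ZooKeeper.

```python
import subprocess

HELIX_ADMIN = "./helix-admin.sh"  # assumed path inside helix-core-pkg/bin

def build_admin_args(zk_addr, command, *params):
    """Assemble the documented form:
    ./helix-admin.sh --zkSvr <ZookeeperServerAddress> <command> <parameters>"""
    return [HELIX_ADMIN, "--zkSvr", zk_addr, "--" + command.lstrip("-"), *params]

def run_admin(zk_addr, command, *params):
    # Requires a built helix-core package and a running ZooKeeper at zk_addr.
    return subprocess.run(build_admin_args(zk_addr, command, *params), check=True)

# Composing a few of the commands from the table above (not executed here):
add_cluster = build_admin_args("localhost:2181", "addCluster", "MyCluster")
add_node = build_admin_args("localhost:2181", "addNode", "MyCluster", "localhost_1001")
```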

    REST interface

The REST interface comes with the helix-admin-webapp package:

    +
      - git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
    +  - cd incubator-helix 
    +  - ./build
    +  - cd helix-admin-webapp/target/helix-admin-webapp-pkg/bin
    +  - chmod +x *.sh
    +  - ./run-rest-admin.sh --zkSvr <zookeeperAddress> --port <port> // make sure zookeeper is running
    +

URLs and supported methods

    +
      +
    • /clusters

      +
        +
      • List all clusters
      • +
      +
        curl http://localhost:8100/clusters
      +
      +
        +
      • Add a cluster
      • +
      +
        curl -d 'jsonParameters={"command":"addCluster","clusterName":"MyCluster"}' -H "Content-Type: application/json" http://localhost:8100/clusters
      +
    • +
    • /clusters/{clusterName}

      +
        +
      • List cluster information
      • +
      +
        curl http://localhost:8100/clusters/MyCluster
      +
      +
        +
      • Enable/disable a cluster in distributed controller mode
      • +
      +
        curl -d 'jsonParameters={"command":"activateCluster","grandCluster":"MyControllerCluster","enabled":"true"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster
      +
      +
        +
      • Remove a cluster
      • +
      +
        curl -X DELETE http://localhost:8100/clusters/MyCluster
      +
    • +
    • /clusters/{clusterName}/resourceGroups

      +
        +
      • List all resources in a cluster
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/resourceGroups
      +
      +
        +
      • Add a resource to cluster
      • +
      +
        curl -d 'jsonParameters={"command":"addResource","resourceGroupName":"MyDB","partitions":"8","stateModelDefRef":"MasterSlave" }' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/resourceGroups
      +
    • +
    • /clusters/{clusterName}/resourceGroups/{resourceName}

      +
        +
      • List resource information
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB
      +
      +
        +
      • Drop a resource
      • +
      +
        curl -X DELETE http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB
      +
      +
        +
      • Reset all erroneous partitions of a resource
      • +
      +
        curl -d 'jsonParameters={"command":"resetResource"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB
      +
    • +
    • /clusters/{clusterName}/resourceGroups/{resourceName}/idealState

      +
        +
      • Rebalance a resource
      • +
      +
        curl -d 'jsonParameters={"command":"rebalance","replicas":"3"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB/idealState
      +
      +
        +
      • Add an ideal state
      • +
      +
      echo jsonParameters={
      +"command":"addIdealState"
      +   }&newIdealState={
      +  "id" : "MyDB",
      +  "simpleFields" : {
      +    "IDEAL_STATE_MODE" : "AUTO",
      +    "NUM_PARTITIONS" : "8",
      +    "REBALANCE_MODE" : "SEMI_AUTO",
      +    "REPLICAS" : "0",
      +    "STATE_MODEL_DEF_REF" : "MasterSlave",
      +    "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
      +  },
      +  "listFields" : {
      +  },
      +  "mapFields" : {
      +    "MyDB_0" : {
      +      "localhost_1001" : "MASTER",
      +      "localhost_1002" : "SLAVE"
      +    }
      +  }
      +}
      +> newIdealState.json
      +curl -d @'./newIdealState.json' -H 'Content-Type: application/json' http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB/idealState
      +
      +
        +
      • Add resource property
      • +
      +
        curl -d 'jsonParameters={"command":"addResourceProperty","REBALANCE_TIMER_PERIOD":"500"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB/idealState
      +
    • +
    • /clusters/{clusterName}/resourceGroups/{resourceName}/externalView

      +
        +
      • Show resource external view
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/resourceGroups/MyDB/externalView
      +
    • +
    • /clusters/{clusterName}/instances

      +
        +
      • List all instances
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/instances
      +
      +
        +
      • Add an instance
      • +
      +
      curl -d 'jsonParameters={"command":"addInstance","instanceNames":"localhost_1001"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances
      +
      +
        +
      • Swap an instance
      • +
      +
        curl -d 'jsonParameters={"command":"swapInstance","oldInstance":"localhost_1001", "newInstance":"localhost_1002"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances
      +
    • +
    • /clusters/{clusterName}/instances/{instanceName}

      +
        +
      • Show instance information
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
      +
        +
      • Enable/disable an instance
      • +
      +
        curl -d 'jsonParameters={"command":"enableInstance","enabled":"false"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
      +
        +
      • Drop an instance
      • +
      +
        curl -X DELETE http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
      +
        +
      • Disable/enable partitions on an instance
      • +
      +
        curl -d 'jsonParameters={"command":"enablePartition","resource": "MyDB","partition":"MyDB_0",  "enabled" : "false"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
      +
        +
      • Reset an erroneous partition on an instance
      • +
      +
        curl -d 'jsonParameters={"command":"resetPartition","resource": "MyDB","partition":"MyDB_0"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
      +
        +
      • Reset all erroneous partitions on an instance
      • +
      +
        curl -d 'jsonParameters={"command":"resetInstance"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/instances/localhost_1001
      +
    • +
    • /clusters/{clusterName}/configs

      +
        +
      • Get user cluster level config
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/configs/cluster
      +
      +
        +
      • Set user cluster level config
      • +
      +
        curl -d 'jsonParameters={"command":"setConfig","configs":"key1=value1,key2=value2"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/configs/cluster
      +
      +
        +
      • Remove user cluster level config
      • +
      +
      curl -d 'jsonParameters={"command":"removeConfig","configs":"key1,key2"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/configs/cluster
      +
      +
        +
      • Get/set/remove user participant level config
      • +
      +
        curl -d 'jsonParameters={"command":"setConfig","configs":"key1=value1,key2=value2"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/configs/participant/localhost_1001
      +
      +
        +
      • Get/set/remove resource level config
      • +
      +
      curl -d 'jsonParameters={"command":"setConfig","configs":"key1=value1,key2=value2"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/configs/resource/MyDB
      +
    • +
    • /clusters/{clusterName}/controller

      +
        +
      • Show controller information
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/Controller
      +
      +
        +
      • Enable/disable cluster
      • +
      +
        curl -d 'jsonParameters={"command":"enableCluster","enabled":"false"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/Controller
      +
    • +
    • /zkPath/{path}

      +
        +
      • Get information for zookeeper path
      • +
      +
        curl http://localhost:8100/zkPath/MyCluster
      +
    • +
    • /clusters/{clusterName}/StateModelDefs

      +
        +
      • Show all state model definitions
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/StateModelDefs
      +
      +
        +
• Add a state model definition
      • +
      +
        echo jsonParameters={
      +    "command":"addStateModelDef"
      +   }&newStateModelDef={
      +      "id" : "OnlineOffline",
      +      "simpleFields" : {
      +        "INITIAL_STATE" : "OFFLINE"
      +      },
      +      "listFields" : {
      +        "STATE_PRIORITY_LIST" : [ "ONLINE", "OFFLINE", "DROPPED" ],
      +        "STATE_TRANSITION_PRIORITYLIST" : [ "OFFLINE-ONLINE", "ONLINE-OFFLINE", "OFFLINE-DROPPED" ]
      +      },
      +      "mapFields" : {
      +        "DROPPED.meta" : {
      +          "count" : "-1"
      +        },
      +        "OFFLINE.meta" : {
      +          "count" : "-1"
      +        },
      +        "OFFLINE.next" : {
      +          "DROPPED" : "DROPPED",
      +          "ONLINE" : "ONLINE"
      +        },
      +        "ONLINE.meta" : {
      +          "count" : "R"
      +        },
      +        "ONLINE.next" : {
      +          "DROPPED" : "OFFLINE",
      +          "OFFLINE" : "OFFLINE"
      +        }
      +      }
      +    }
      +    > newStateModelDef.json
+    curl -d @'./newStateModelDef.json' -H 'Content-Type: application/json' http://localhost:8100/clusters/MyCluster/StateModelDefs
      +
    • +
    • /clusters/{clusterName}/StateModelDefs/{stateModelDefName}

      +
        +
      • Show a state model definition
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/StateModelDefs/OnlineOffline
      +
    • +
    • /clusters/{clusterName}/constraints/{constraintType}

      +
        +
• Show all constraints
      • +
      +
        curl http://localhost:8100/clusters/MyCluster/constraints/MESSAGE_CONSTRAINT
      +
      +
        +
• Set a constraint
      • +
      +
         curl -d 'jsonParameters={"constraintAttributes":"RESOURCE=MyDB,CONSTRAINT_VALUE=1"}' -H "Content-Type: application/json" http://localhost:8100/clusters/MyCluster/constraints/MESSAGE_CONSTRAINT/MyConstraint
      +
      +
        +
      • Remove a constraint
      • +
      +
        curl -X DELETE http://localhost:8100/clusters/MyCluster/constraints/MESSAGE_CONSTRAINT/MyConstraint
      +
    • +
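All of the POST examples above share the same `jsonParameters={...}` body convention, so the payloads can be assembled programmatically. The following sketch mirrors the curl examples; the base URL, cluster, and resource names are assumptions taken from those examples, not fixed values.

```python
import json

BASE = "http://localhost:8100"  # admin webapp address used in the examples above

def json_parameters(command, **kwargs):
    """Build the 'jsonParameters={...}' body that the curl examples post."""
    payload = {"command": command, **kwargs}
    return "jsonParameters=" + json.dumps(payload)

# e.g. the addCluster call, posted to {BASE}/clusters:
body = json_parameters("addCluster", clusterName="MyCluster")

# The ideal state document posted to /clusters/{cluster}/resourceGroups/{resource}/idealState,
# matching the newIdealState.json example above:
ideal_state = {
    "id": "MyDB",
    "simpleFields": {
        "IDEAL_STATE_MODE": "AUTO",
        "NUM_PARTITIONS": "8",
        "REBALANCE_MODE": "SEMI_AUTO",
        "REPLICAS": "0",
        "STATE_MODEL_DEF_REF": "MasterSlave",
        "STATE_MODEL_FACTORY_NAME": "DEFAULT",
    },
    "listFields": {},
    "mapFields": {"MyDB_0": {"localhost_1001": "MASTER", "localhost_1002": "SLAVE"}},
}
```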
    Modified: incubator/helix/site-content/tutorial_controller.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_controller.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/tutorial_controller.html (original) +++ incubator/helix/site-content/tutorial_controller.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - @@ -86,6 +86,9 @@
  • @@ -225,7 +228,7 @@ under the License. -->

    Helix Tutorial

The snippet above shows how the controller is started. You can also start the controller using the command-line interface.

    cd helix/helix-core/target/helix-core-pkg/bin
     ./run-helix-controller.sh --zkSvr <Zookeeper ServerAddress (Required)>  --cluster <Cluster name (Required)>
    -

    Controller deployment modes

    Helix provides multiple options to deploy the controller.

    STANDALONE

    The Controller can be started as a separate process to manage a cluster. This is the recommended approach. However, since one controller can be a single point of failure, multiple controller processes are required for reliability. Even if multiple controllers are running, only one will be actively managing the cluster at any time and is decided by a leader-election process. If the leader fails, another leader will take over managing the cluster.

    Even though we recommend this method of deployment, it has the drawback of having to manage an additional service for each cluster. See Controller As a Service option.

    EMBEDDED

    If setting up a separate controller process is not viable, then it is possible to embed the controller as a library in each of the participants.

    CONTROLLER AS A SERVICE

    One of the cool feature we added in Helix was to use a set of controllers to manage a large number of clusters.

    For example if you have X clusters to be managed, instead of deploying X*3 (3 controllers for fault tolerance) controllers for each cluster, one can deploy just 3 controllers. Each controller can manage X/3 clusters. If any controller fails, the remaining two will manage X/2 clusters.

    +

    Controller deployment modes

    Helix provides multiple options to deploy the controller.

    STANDALONE

    The Controller can be started as a separate process to manage a cluster. This is the recommended approach. However, since one controller can be a single point of failure, multiple controller processes are required for reliability. Even if multiple controllers are running, only one will be actively managing the cluster at any time and is decided by a leader-election process. If the leader fails, another leader will take over managing the cluster.

    Even though we recommend this method of deployment, it has the drawback of having to manage an additional service for each cluster. See Controller As a Service option.

    EMBEDDED

    If setting up a separate controller process is not viable, then it is possible to embed the controller as a library in each of the participants.

    CONTROLLER AS A SERVICE

    One of the cool features we added in Helix is to use a set of controllers to manage a large number of clusters.

    For example if you have X clusters to be managed, instead of deploying X*3 (3 controllers for fault tolerance) controllers for each cluster, one can deploy just 3 controllers. Each controller can manage X/3 clusters. If any controller fails, the remaining two will manage X/2 clusters.
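The failover arithmetic above can be checked with a quick sketch (illustrative only; Helix performs this distribution itself via leader election):

```python
import math

def clusters_per_controller(num_clusters, live_controllers):
    # Each live controller picks up an even share of the managed clusters.
    return math.ceil(num_clusters / live_controllers)

# 12 clusters over 3 controllers -> 4 clusters each;
# after one controller fails -> 6 clusters each on the remaining two.
```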

    Modified: incubator/helix/site-content/tutorial_health.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_health.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/tutorial_health.html (original) +++ incubator/helix/site-content/tutorial_health.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - @@ -86,6 +86,9 @@
  • Modified: incubator/helix/site-content/tutorial_messaging.html URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_messaging.html?rev=1519752&r1=1519751&r2=1519752&view=diff ============================================================================== --- incubator/helix/site-content/tutorial_messaging.html (original) +++ incubator/helix/site-content/tutorial_messaging.html Tue Sep 3 16:43:37 2013 @@ -1,13 +1,13 @@ - + Apache Helix - @@ -86,6 +86,9 @@
  • @@ -193,7 +196,7 @@ software distributed under the License i "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations -under the License. -->

    Helix Tutorial: Messaging

    In this chapter, we'll learn about messaging, a convenient feature in Helix for sending messages between nodes of a cluster. This is an interesting feature which is quite useful in practice. It is common that nodes in a distributed system require a mechanism to interact with each other.

    Example: Bootstrapping a Replica

    Consider a search system where the index replica starts up and it does not have an index. A typical solution is to get the index from a common location, or to copy the index from another replica.

    Helix provides a messaging api for intra-cluster communication between nodes in the system. Helix provides a mechanism to specify the message recipient in terms of resource, partition, and state rather than specifying hostnames. Helix ensures that the message is delivered to all of the required recipients. In this particular use case, the instance can specify the recipient criteria as all replicas of the desired partition to bootstrap. Since Helix is aware of the global state of the system, it can send the message to appropriate nodes. Once the nodes respond, Helix provides the bootstrapping replica with all the responses.

    This is a very generic api and can also be used to schedule various periodic tasks in the cluster, such as data backups, log cleanup, etc. System Admins can also perform ad-hoc tasks, such as on-demand backups or a system command (such as rm -rf ;) across all nodes of the cluster

    +under the License. -->

    Helix Tutorial: Messaging

In this chapter, we'll learn about messaging, a convenient Helix feature for sending messages between the nodes of a cluster. Nodes in a distributed system commonly need a mechanism to interact with each other, and this feature provides one.

    Example: Bootstrapping a Replica

    Consider a search system where the index replica starts up and it does not have an index. A typical solution is to get the index from a common location, or to copy the index from another replica.

    Helix provides a messaging API for intra-cluster communication between nodes in the system. Helix provides a mechanism to specify the message recipient in terms of resource, partition, and state rather than specifying hostnames. Helix ensures that the message is delivered to all of the required recipients. In this particular use case, the instance can specify the recipient criteria as all replicas of the desired partition to bootstrap. Since Helix is aware of the global state of the system, it can send the message to appropriate nodes. Once the nodes respond, Helix provides the bootstrapping replica with all the responses.

This is a very generic API and can also be used to schedule various periodic tasks in the cluster, such as data backups, log cleanup, etc. System administrators can also perform ad-hoc tasks, such as on-demand backups or a system command (such as rm -rf ;) across all nodes of the cluster.

          ClusterMessagingService messagingService = manager.getMessagingService();
     
           // Construct the Message
    
    Modified: incubator/helix/site-content/tutorial_participant.html
    URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_participant.html?rev=1519752&r1=1519751&r2=1519752&view=diff
    ==============================================================================
    --- incubator/helix/site-content/tutorial_participant.html (original)
    +++ incubator/helix/site-content/tutorial_participant.html Tue Sep  3 16:43:37 2013
    @@ -1,13 +1,13 @@
     
     
     
       
         
         
    -    
    +    
         
         Apache Helix - 
         
    @@ -86,6 +86,9 @@
                       
                           
  • @@ -193,7 +196,7 @@ software distributed under the License i "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations -under the License. -->

    Helix Tutorial: Participant

    In this chapter, we'll learn how to implement a PARTICIPANT, which is a primary functional component of a distributed system.

    Start the Helix agent

    The Helix agent is a common component that connects each system component with the controller.

    It requires the following parameters:

    +under the License. -->

    Helix Tutorial: Participant

    In this chapter, we'll learn how to implement a Participant, which is a primary functional component of a distributed system.

    Start the Helix agent

    The Helix agent is a common component that connects each system component with the controller.

    It requires the following parameters:

    • clusterName: A logical name to represent the group of nodes
    • instanceName: A logical name of the process creating the manager instance. Generally this is host:port.
    • MasterSlaveStateModelFactory
    • LeaderStandbyStateModelFactory
    • BootstrapHandler
    • An application-defined state model factory
          manager = HelixManagerFactory.getZKHelixManager(clusterName,
                                                          instanceName,
                                                          ...);

          stateModelFactory = new OnlineOfflineStateModelFactory();
          stateMach.registerStateModelFactory(stateModelType, stateModelFactory);
          manager.connect();


    Helix doesn't know what it means to change from OFFLINE-->ONLINE or ONLINE-->OFFLINE. The following code snippet shows where you insert your system logic for these two state transitions.

    public class OnlineOfflineStateModelFactory extends
            StateModelFactory<StateModel> {
        @Override
        public StateModel createNewStateModel(String partitionName) {
            // Return a state model instance to handle this partition's transitions
            return new OnlineOfflineStateModel();
        }
    }
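The transition callbacks themselves follow Helix's onBecome&lt;TO&gt;From&lt;FROM&gt; naming convention. Below is a self-contained sketch, with stand-in Message and NotificationContext classes in place of the real org.apache.helix types (the real callbacks receive those library types):

```java
// Stand-ins for org.apache.helix.model.Message and
// org.apache.helix.NotificationContext, so this sketch compiles on its own.
class Message {}
class NotificationContext {}

class OnlineOfflineStateModel {
    String state = "OFFLINE";

    // Invoked when the controller moves this replica to ONLINE;
    // application logic (e.g. start serving the partition) goes here.
    public void onBecomeOnlineFromOffline(Message message, NotificationContext context) {
        state = "ONLINE";
    }

    // Invoked when the replica must stop serving.
    public void onBecomeOfflineFromOnline(Message message, NotificationContext context) {
        state = "OFFLINE";
    }
}
```

The method name encodes the transition, so Helix can route each state change to the right handler without any registration beyond the factory.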
    Modified: incubator/helix/site-content/tutorial_propstore.html
    URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_propstore.html?rev=1519752&r1=1519751&r2=1519752&view=diff
    ==============================================================================
    --- incubator/helix/site-content/tutorial_propstore.html (original)
    +++ incubator/helix/site-content/tutorial_propstore.html Tue Sep  3 16:43:37 2013
    @@ -1,13 +1,13 @@
     
     
     
       
         
         
Modified: incubator/helix/site-content/tutorial_rebalance.html
URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_rebalance.html?rev=1519752&r1=1519751&r2=1519752&view=diff
==============================================================================
--- incubator/helix/site-content/tutorial_rebalance.html (original)
+++ incubator/helix/site-content/tutorial_rebalance.html Tue Sep  3 16:43:37 2013


    Helix Tutorial: Rebalancing Algorithms

    The placement of partitions in a distributed system is essential for the reliability and scalability of the system. For example, when a node fails, it is important that the partitions hosted on that node are reallocated evenly among the remaining nodes. Consistent hashing is one such algorithm that can satisfy this guarantee. Helix provides a variant of consistent hashing based on the RUSH algorithm, among others.

    Given the number of partitions, replicas, and nodes, Helix automatically assigns partitions to nodes such that:

    • Each node has the same number of partitions
    • Replicas of the same partition do not stay on the same node
    • When a node fails, the partitions will be equally distributed among the remaining nodes
    • When new nodes are added, the number of partitions moved is minimized while still satisfying the above criteria

    Helix employs a rebalancing algorithm to compute the ideal state of the system. When the current state differs from the ideal state, Helix uses it as the target state of the system and computes the appropriate transitions needed to bring it to the ideal state.

    Helix makes it easy to perform this operation, while giving you control over the algorithm. In this section, we'll see how to implement the desired behavior.

    Helix has four options for rebalancing, in increasing order of customization by the system builder:

    • FULL_AUTO
    • SEMI_AUTO
    • CUSTOMIZED
    • USER_DEFINED
                | FULL_AUTO | SEMI_AUTO | CUSTOMIZED | USER_DEFINED |
       ---------|-----------|-----------|------------|--------------|
       LOCATION |   HELIX   |    APP    |    APP     |     APP      |
          STATE |   HELIX   |   HELIX   |    APP     |     APP      |

    FULL_AUTO

    When the rebalance mode is set to FULL_AUTO, Helix controls both the location of the replica along with the state. This option is useful for applications where creation of a replica is not expensive.

    For example, consider this system that uses a MasterSlave state model, with 3 partitions and 2 replicas in the ideal state.

    {
       "id" : "MyResource",
       "simpleFields" : {
         "REBALANCE_MODE" : "FULL_AUTO",
         "NUM_PARTITIONS" : "3",
         "REPLICAS" : "2",
         "STATE_MODEL_DEF_REF" : "MasterSlave",
         ...
       }
    }


    Another typical example is evenly distributing a group of tasks among the currently healthy processes. For example, if there are 60 tasks and 4 nodes, Helix assigns 15 tasks to each node. When one node fails, Helix redistributes its 15 tasks to the remaining 3 nodes, resulting in a balanced 20 tasks per node. Similarly, if a node is added, Helix re-allocates 3 tasks from each of the 4 nodes to the 5th node, resulting in a balanced distribution of 12 tasks per node.
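The arithmetic above can be sketched in a few lines (illustrative only; Helix's actual FULL_AUTO assignment also minimizes movement and respects replica-placement constraints):

```java
public class EvenDistribution {
    // Tasks each node receives when `tasks` are spread evenly across `nodes`.
    static int tasksPerNode(int tasks, int nodes) {
        return tasks / nodes;
    }

    public static void main(String[] args) {
        System.out.println(tasksPerNode(60, 4)); // 4 healthy nodes -> 15 each
        System.out.println(tasksPerNode(60, 3)); // one node fails -> 20 each
        System.out.println(tasksPerNode(60, 5)); // a node is added -> 12 each
    }
}
```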

    SEMI_AUTO

    When the application needs to control the placement of the replicas, use the SEMI_AUTO rebalance mode.

    Example: In the ideal state below, the partition 'MyResource_0' is constrained to be placed only on node1 or node2. The choice of state is still controlled by Helix. That means MyResource_0.MASTER could be on node1 and MyResource_0.SLAVE on node2, or vice versa, but neither would be placed on node3.

    {
       "id" : "MyResource",
       "simpleFields" : {
         "REBALANCE_MODE" : "SEMI_AUTO",
         "NUM_PARTITIONS" : "3",
         "REPLICAS" : "2",
         "STATE_MODEL_DEF_REF" : "MasterSlave",
         ...
         "mapFields" : {
         }
    }


    The MasterSlave state model requires that a partition has exactly one MASTER at all times, and the other replicas should be SLAVEs. In this simple example with 2 replicas per partition, there would be one MASTER and one SLAVE. Upon failover, a SLAVE has to assume mastership, and a new SLAVE will be generated.

    In this mode, when node1 fails, unlike in FULL_AUTO mode, the partition is not moved from node1 to node3. Instead, Helix will decide to change the state of MyResource_0 on node2 from SLAVE to MASTER, based on the system constraints.
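A toy sketch of this placement rule (a hypothetical helper, not Helix's actual constraint solver): the application fixes the preference list [node1, node2], and states are chosen only among the live nodes on that list, so node3 never receives a replica:

```java
import java.util.*;

public class SemiAutoPlacement {
    // First live node in the preference list becomes MASTER, the rest SLAVEs;
    // nodes outside the list are never assigned a replica.
    static Map<String, String> assignStates(List<String> preferenceList,
                                            Set<String> liveNodes) {
        Map<String, String> states = new LinkedHashMap<>();
        for (String node : preferenceList) {
            if (!liveNodes.contains(node)) continue;
            states.put(node, states.containsValue("MASTER") ? "SLAVE" : "MASTER");
        }
        return states;
    }

    public static void main(String[] args) {
        List<String> preference = List.of("node1", "node2");
        // All nodes healthy: node1 is MASTER, node2 is SLAVE
        System.out.println(assignStates(preference, Set.of("node1", "node2", "node3")));
        // node1 fails: node2 is promoted in place; nothing moves to node3
        System.out.println(assignStates(preference, Set.of("node2", "node3")));
    }
}
```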

    CUSTOMIZED

    Helix offers a third mode called CUSTOMIZED, in which the application controls the placement and state of each replica. The application needs to implement a callback interface that Helix invokes when the cluster state changes. Within this callback, the application can recompute the ideal state. Helix will then issue appropriate transitions such that the IdealState and CurrentState converge.

    Here's an example, again with 3 partitions, 2 replicas per partition, and the MasterSlave state model:

    {
       "id" : "MyResource",
       "simpleFields" : {
         "REBALANCE_MODE" : "CUSTOMIZED",
         "NUM_PARTITIONS" : "3",
         "REPLICAS" : "2",
         "STATE_MODEL_DEF_REF" : "MasterSlave",
         ...
       }
    }


    Suppose the current state of the system is MyResource_0 -> {N1:MASTER, N2:SLAVE} and the application changes the ideal state to MyResource_0 -> {N1:SLAVE, N2:MASTER}. While the application decides which node is MASTER and which is SLAVE, Helix will not blindly issue MASTER-->SLAVE to N1 and SLAVE-->MASTER to N2 in parallel, since that might result in a transient state where both N1 and N2 are masters, which violates the MasterSlave constraint that there is exactly one MASTER at a time. Helix will first issue MASTER-->SLAVE to N1 and, after it is completed, issue SLAVE-->MASTER to N2.
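This ordering rule can be sketched with a hypothetical helper (not Helix's real transition planner): demotions from MASTER are issued and completed before any promotion, so at most one MASTER exists at any instant:

```java
import java.util.*;

public class TransitionOrdering {
    // Orders transitions so MASTER demotions come before promotions.
    static List<String> orderTransitions(Map<String, String> current,
                                         Map<String, String> target) {
        List<String> demotions = new ArrayList<>();
        List<String> promotions = new ArrayList<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            String node = e.getKey();
            String from = e.getValue();
            String to = target.get(node);
            if (from.equals(to)) continue;
            String step = node + ":" + from + "-->" + to;
            if (from.equals("MASTER")) demotions.add(step);
            else promotions.add(step);
        }
        List<String> ordered = new ArrayList<>(demotions);
        ordered.addAll(promotions);
        return ordered;
    }

    public static void main(String[] args) {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("N1", "MASTER");
        current.put("N2", "SLAVE");
        Map<String, String> target = new LinkedHashMap<>();
        target.put("N1", "SLAVE");
        target.put("N2", "MASTER");
        // N1 is demoted first, then N2 is promoted
        System.out.println(orderTransitions(current, target));
    }
}
```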

    USER_DEFINED

    For maximum flexibility, Helix exposes an interface that allows applications to plug in custom rebalancing logic. By providing the name of a class that implements the Rebalancer interface, Helix will automatically call the contained method whenever there is a change to the live participants in the cluster. For more, see User-Defined Rebalancer.

    Backwards Compatibility

    In previous versions, FULL_AUTO was called AUTO_REBALANCE and SEMI_AUTO was called AUTO. Furthermore, they were presented as the IDEAL_STATE_MODE. Helix supports both IDEAL_STATE_MODE and REBALANCE_MODE, but IDEAL_STATE_MODE is now deprecated and may be phased out in future versions.

Modified: incubator/helix/site-content/tutorial_spectator.html
URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_spectator.html?rev=1519752&r1=1519751&r2=1519752&view=diff
==============================================================================
--- incubator/helix/site-content/tutorial_spectator.html (original)
+++ incubator/helix/site-content/tutorial_spectator.html Tue Sep  3 16:43:37 2013


    Helix Tutorial: Spectator

    Next, we'll learn how to implement a Spectator. Typically, a spectator needs to react to changes within the distributed system. Examples: a client that needs to know where to send a request, a topic consumer in a consumer group. The spectator is automatically informed of changes in the external state of the cluster, but it does not have to add any code to keep track of other components in the system.

    Start the Helix agent

    As with a Participant, the Helix agent is the common component that connects each system component with the controller.

    It requires the following parameters:

    • clusterName: A logical name to represent the group of nodes
    • instanceName: A logical name of the process creating the manager instance. Generally this is host:port.
Modified: incubator/helix/site-content/tutorial_state.html
URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_state.html?rev=1519752&r1=1519751&r2=1519752&view=diff
==============================================================================
--- incubator/helix/site-content/tutorial_state.html (original)
+++ incubator/helix/site-content/tutorial_state.html Tue Sep  3 16:43:37 2013
Modified: incubator/helix/site-content/tutorial_throttling.html
URL: http://svn.apache.org/viewvc/incubator/helix/site-content/tutorial_throttling.html?rev=1519752&r1=1519751&r2=1519752&view=diff
==============================================================================
--- incubator/helix/site-content/tutorial_throttling.html (original)
+++ incubator/helix/site-content/tutorial_throttling.html Tue Sep  3 16:43:37 2013