helix-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From kisho...@apache.org
Subject git commit: [HELIX-15] distributed lock manager recipe
Date Mon, 17 Dec 2012 16:11:14 GMT
Updated Branches:
  refs/heads/master 9519b4740 -> 7b6790dcb


[HELIX-15] distributed lock manager recipe


Project: http://git-wip-us.apache.org/repos/asf/incubator-helix/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-helix/commit/7b6790dc
Tree: http://git-wip-us.apache.org/repos/asf/incubator-helix/tree/7b6790dc
Diff: http://git-wip-us.apache.org/repos/asf/incubator-helix/diff/7b6790dc

Branch: refs/heads/master
Commit: 7b6790dcbc9b6c8c263ae8b06a2ca8a403fb1c2f
Parents: 9519b47
Author: Kishore Gopalakrishna <g.kishore@gmail.com>
Authored: Mon Dec 17 08:10:52 2012 -0800
Committer: Kishore Gopalakrishna <g.kishore@gmail.com>
Committed: Mon Dec 17 08:10:52 2012 -0800

----------------------------------------------------------------------
 recipes/distributed-lock-manager/README.md         |  180 ++++++++++++++
 recipes/distributed-lock-manager/pom.xml           |  123 ++++++++++
 .../src/main/config/log4j.properties               |   31 +++
 .../java/org/apache/helix/lockmanager/Lock.java    |   50 ++++
 .../org/apache/helix/lockmanager/LockFactory.java  |   31 +++
 .../apache/helix/lockmanager/LockManagerDemo.java  |  184 +++++++++++++++
 .../org/apache/helix/lockmanager/LockProcess.java  |  128 ++++++++++
 recipes/pom.xml                                    |    1 +
 recipes/rsync-replicated-file-system/README.md     |   93 ++++++--
 src/site/markdown/recipes/lock_manager.md          |  180 ++++++++++++++
 src/site/resources/images/helix-logo.jpg           |  Bin 0 -> 13659 bytes
 11 files changed, 986 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/README.md
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/README.md b/recipes/distributed-lock-manager/README.md
new file mode 100644
index 0000000..5ef55d9
--- /dev/null
+++ b/recipes/distributed-lock-manager/README.md
@@ -0,0 +1,180 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+Distributed lock manager
+------------------------
+Distributed locks are used to synchronize accesses shared resources. Most applications use Zookeeper to model the distributed locks. 
+
+The simplest way to model a lock using zookeeper is (See Zookeeper leader recipe for an exact and more advanced solution)
+
+* Each process tries to create an emphemeral node.
+* If can successfully create it then, it acquires the lock
+* Else it will watch on the znode and try to acquire the lock again.
+
+This is good enough if there is only one lock. But in practice, an application will have many such locks. Distributing and managing the locks among difference process becomes challenging. Extending such a solution to many locks will result in
+
+* Uneven distribution of locks among nodes, the node that starts first will acquire all the lock. Nodes that start later will be idle.
+* When a node fails, how the locks will be distributed among remaining nodes is not predicable. 
+* When new nodes are added the current nodes dont relinquish the locks so that new nodes can acquire some locks
+
+In other words we want a system to satisfy the following requirements.
+
+* Distribute locks evenly among all nodes to get better hardware utilization
+* If a node fails, the locks that were acquired by that node should be evenly distributed among other nodes
+* If nodes are added, locks must be evenly re-distributed among nodes.
+
+Helix provides a simple and elegant solution to this problem. Simply specify the number of locks and Helix will ensure that above constraints are satisfied. 
+
+To quickly see this working run the lock-manager-demo script where 12 locks are evenly distributed among three nodes, and when a node fails, the locks get re-distributed among remaining two nodes. Note that Helix does not re-shuffle the locks completely, instead it simply distributes the locks relinquished by dead node among 2 remaining nodes evenly.
+
+#### Quick version
+ This version starts multiple threads with in same process to simulate a multi node deployment. Try the long version to get a better idea of how it works.
+ 
+```
+git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
+cd incubator-helix
+mvn clean install package -DskipTests
+cd recipes/distributed-lock-manager/target/distributed-lock-manager-pkg/bin
+chmod +x *
+./lock-manager-demo
+
+``` 
+
+#### Long version
+This provides more details on how to setup the cluster and where to plugin application code.
+
+#### start zookeeper
+
+```
+./start-standalone-zookeeper 2199
+```
+
+#### Create a cluster
+
+```
+./helix-admin --zkSvr localhost:2199 --addCluster lock-manager-demo
+```
+
+#### Create a lock group
+
+Create a lock group and specify the number of locks in the lock group. You can change add new locks dynamically later. 
+```
+./helix-admin --zkSvr localhost:2199  --addResource lock-manager-demo lock-group 6 OnlineOffline AUTO_REBALANCE
+```
+
+#### Start the nodes
+
+Create a Lock class that handles the callbacks. 
+
+```
+public class Lock extends StateModel
+{
+  private String lockName;
+
+  public Lock(String lockName)
+  {
+    this.lockName = lockName;
+  }
+
+  public void lock(Message m, NotificationContext context)
+  {
+    System.out.println(" acquired lock:"+ lockName );
+  }
+
+  public void release(Message m, NotificationContext context)
+  {
+    System.out.println(" releasing lock:"+ lockName );
+  }
+
+}
+
+```
+
+LockFactory that creates the lock 
+```
+public class LockFactory extends StateModelFactory<Lock>{
+	/* Instantiates the lock handler, one per lockName*/
+	
+    public Lock create(String lockName)
+    {
+    	return new Lock(lockName);
+    }   
+}
+```
+Thats it, now when the node starts simply join the cluster and helix will invoke the appropriate call backs on Lock.
+
+```
+public class MyClass{
+
+	public static void main(String args){
+	    String zkAddress= "localhost:2199";
+	    String clusterName = "lock-manager-demo";
+	    //Give a unique id to each process, most commonly used format hostname_port
+	    String instanceName ="localhost_12000";
+	    ZKHelixAdmin helixAdmin = new ZKHelixAdmin(zkAddress);
+	    //configure the instance and provide some metadata 
+	    InstanceConfig config = new InstanceConfig(instanceName);
+	    config.setHostName("localhost");
+	    config.setPort("12000");
+	    admin.addInstance(clusterName, config);
+	    //join the cluster
+		HelixManager manager = HelixManager.getHelixManager(clusterName,null,InstanceType.PARTICIPANT,zkAddress);
+		manager.getStateMachineEngine().registerStateModelFactory("OnlineOffline", modelFactory);
+		manager.connect();
+		Thread.currentThread.join();
+	}
+
+}
+```
+
+#### Start the controller
+
+Controller can be started either as a separate process or can be embedded within each node process
+
+##### Separate process
+This is recommended when number of nodes in the cluster >100. For fault tolerance, you can run multiple controllers on different boxes.
+
+```
+./run-helix-controller --zkSvr localhost:2199 --cluster mycluster 2>&1 > /tmp/controller.log &
+```
+
+##### Embedded within the node process
+This is recommended when the number of nodes in the cluster is less than 100. To start a controller from each process, simply add the following lines to MyClass
+
+```
+public class MyClass{
+
+	public static void main(String args){
+	    String zkAddress= "localhost:2199";
+	    String clusterName = "lock-manager-demo";
+	    .
+	    .
+	    manager.connect();
+	    final HelixManager controller = HelixControllerMain.startHelixController(zkAddress, clusterName, "controller", HelixControllerMain.STANDALONE);
+	    Thread.currentThread.join();
+	}
+}
+```
+
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/pom.xml
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/pom.xml b/recipes/distributed-lock-manager/pom.xml
new file mode 100644
index 0000000..9883f1c
--- /dev/null
+++ b/recipes/distributed-lock-manager/pom.xml
@@ -0,0 +1,123 @@
+<?xml version="1.0" encoding="UTF-8" ?>
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.helix.recipes</groupId>
+    <artifactId>recipes</artifactId>
+    <version>0.7-incubating-SNAPSHOT</version>
+  </parent>
+
+  <artifactId>distributed-lock-manager</artifactId>
+  <packaging>jar</packaging>
+  <version>1.0-SNAPSHOT</version>
+  <name>Apache Helix :: Recipes :: distributed lock manager</name>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.helix</groupId>
+      <artifactId>helix-core</artifactId>
+      <version>0.7-incubating-SNAPSHOT</version>
+    </dependency>
+    <dependency>
+      <groupId>log4j</groupId>
+      <artifactId>log4j</artifactId>
+      <exclusions>
+        <exclusion>
+          <groupId>javax.mail</groupId>
+          <artifactId>mail</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>javax.jms</groupId>
+          <artifactId>jms</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.sun.jdmk</groupId>
+          <artifactId>jmxtools</artifactId>
+        </exclusion>
+        <exclusion>
+          <groupId>com.sun.jmx</groupId>
+          <artifactId>jmxri</artifactId>
+        </exclusion>
+      </exclusions>
+    </dependency>
+  </dependencies>
+  <build>
+    <pluginManagement>
+      <plugins>
+        <plugin>
+          <groupId>org.codehaus.mojo</groupId>
+          <artifactId>appassembler-maven-plugin</artifactId>
+          <configuration>
+            <!-- Set the target configuration directory to be used in the bin scripts -->
+            <!-- <configurationDirectory>conf</configurationDirectory> -->
+            <!-- Copy the contents from "/src/main/config" to the target configuration
+              directory in the assembled application -->
+            <!-- <copyConfigurationDirectory>true</copyConfigurationDirectory> -->
+            <!-- Include the target configuration directory in the beginning of
+              the classpath declaration in the bin scripts -->
+            <includeConfigurationDirectoryInClasspath>true</includeConfigurationDirectoryInClasspath>
+            <assembleDirectory>${project.build.directory}/${project.artifactId}-pkg</assembleDirectory>
+            <!-- Extra JVM arguments that will be included in the bin scripts -->
+            <extraJvmArguments>-Xms512m -Xmx512m</extraJvmArguments>
+            <!-- Generate bin scripts for windows and unix pr default -->
+            <platforms>
+              <platform>windows</platform>
+              <platform>unix</platform>
+            </platforms>
+          </configuration>
+          <executions>
+            <execution>
+              <phase>package</phase>
+              <goals>
+                <goal>assemble</goal>
+              </goals>
+            </execution>
+          </executions>
+        </plugin>
+        <plugin>
+          <groupId>org.apache.rat</groupId>
+          <artifactId>apache-rat-plugin</artifactId>
+          <version>0.8</version>
+            <configuration>
+              <excludes combine.children="append">
+              </excludes>
+            </configuration>
+        </plugin>
+      </plugins>
+    </pluginManagement>
+    <plugins>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>appassembler-maven-plugin</artifactId>
+        <configuration>
+          <programs>
+            <program>
+              <mainClass>org.apache.helix.lockmanager.LockManagerDemo</mainClass>
+              <name>lock-manager-demo</name>
+            </program>
+          </programs>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+</project>

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/src/main/config/log4j.properties
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/src/main/config/log4j.properties b/recipes/distributed-lock-manager/src/main/config/log4j.properties
new file mode 100644
index 0000000..4b3dc31
--- /dev/null
+++ b/recipes/distributed-lock-manager/src/main/config/log4j.properties
@@ -0,0 +1,31 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Set root logger level to DEBUG and its only appender to A1.
+log4j.rootLogger=ERROR,A1
+
+# A1 is set to be a ConsoleAppender.
+log4j.appender.A1=org.apache.log4j.ConsoleAppender
+
+# A1 uses PatternLayout.
+log4j.appender.A1.layout=org.apache.log4j.PatternLayout
+log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
+
+log4j.logger.org.I0Itec=ERROR
+log4j.logger.org.apache=ERROR

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/Lock.java
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/Lock.java b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/Lock.java
new file mode 100644
index 0000000..acee798
--- /dev/null
+++ b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/Lock.java
@@ -0,0 +1,50 @@
+package org.apache.helix.lockmanager;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.NotificationContext;
+import org.apache.helix.model.Message;
+import org.apache.helix.participant.statemachine.StateModel;
+import org.apache.helix.participant.statemachine.StateModelInfo;
+import org.apache.helix.participant.statemachine.Transition;
+
+@StateModelInfo(initialState = "OFFLINE", states = { "OFFLINE", "ONLINE" })
+public class Lock extends StateModel
+{
+  private String lockName;
+
+  public Lock(String lockName)
+  {
+    this.lockName = lockName;
+  }
+
+  @Transition(from = "OFFLINE", to = "ONLINE")
+  public void lock(Message m, NotificationContext context)
+  {
+    System.out.println(context.getManager().getInstanceName() + " acquired lock:"+ lockName );
+  }
+
+  @Transition(from = "ONLINE", to = "OFFLINE")
+  public void release(Message m, NotificationContext context)
+  {
+    System.out.println(context.getManager().getInstanceName() + " releasing lock:"+ lockName );
+  }
+
+}

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockFactory.java
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockFactory.java b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockFactory.java
new file mode 100644
index 0000000..9c0d3e8
--- /dev/null
+++ b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockFactory.java
@@ -0,0 +1,31 @@
+package org.apache.helix.lockmanager;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import org.apache.helix.participant.statemachine.StateModelFactory;
+
+public class LockFactory extends StateModelFactory<Lock>
+{
+  @Override
+  public Lock createNewStateModel(String lockName)
+  {
+    return new Lock(lockName);
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockManagerDemo.java
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockManagerDemo.java b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockManagerDemo.java
new file mode 100644
index 0000000..11cfdaf
--- /dev/null
+++ b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockManagerDemo.java
@@ -0,0 +1,184 @@
+package org.apache.helix.lockmanager;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.io.File;
+import java.util.Map;
+import java.util.TreeSet;
+
+import org.I0Itec.zkclient.IDefaultNameSpace;
+import org.I0Itec.zkclient.ZkClient;
+import org.I0Itec.zkclient.ZkServer;
+import org.apache.commons.io.FileUtils;
+import org.apache.helix.HelixAdmin;
+import org.apache.helix.HelixManager;
+import org.apache.helix.controller.HelixControllerMain;
+import org.apache.helix.manager.zk.ZKHelixAdmin;
+import org.apache.helix.model.ExternalView;
+import org.apache.helix.model.StateModelDefinition;
+import org.apache.helix.model.IdealState.IdealStateModeProperty;
+import org.apache.helix.tools.StateModelConfigGenerator;
+
+public class LockManagerDemo
+{
+  /**
+   * LockManagerDemo clusterName, numInstances, lockGroupName, numLocks
+   * 
+   * @param args
+   * @throws Exception
+   */
+  public static void main(String[] args) throws Exception
+  {
+    final String zkAddress = "localhost:2199";
+    final String clusterName = "lock-manager-demo";
+    final String lockGroupName = "lock-group";
+    final int numInstances = 3;
+    final int numPartitions = 12;
+    final boolean startController = false;
+    HelixManager controllerManager = null;
+    Thread[] processArray;
+    processArray = new Thread[numInstances];
+    try
+    {
+      startLocalZookeeper(2199);
+      HelixAdmin admin = new ZKHelixAdmin(zkAddress);
+      admin.addCluster(clusterName, true);
+      StateModelConfigGenerator generator = new StateModelConfigGenerator();
+      admin.addStateModelDef(clusterName, "OnlineOffline",
+          new StateModelDefinition(generator.generateConfigForOnlineOffline()));
+      admin.addResource(clusterName, lockGroupName, numPartitions,
+          "OnlineOffline", IdealStateModeProperty.AUTO_REBALANCE.toString());
+      admin.rebalance(clusterName, lockGroupName, 1);
+      for (int i = 0; i < numInstances; i++)
+      {
+        final String instanceName = "localhost_" + (12000 + i);
+        processArray[i] = new Thread(new Runnable()
+        {
+
+          @Override
+          public void run()
+          {
+            LockProcess lockProcess = null;
+
+            try
+            {
+              lockProcess = new LockProcess(clusterName, zkAddress,
+                  instanceName, startController);
+              lockProcess.start();
+              Thread.currentThread().join();
+            } catch (InterruptedException e)
+            {
+              System.out.println(instanceName + "Interrupted");
+              if (lockProcess != null)
+              {
+                lockProcess.stop();
+              }
+            } catch (Exception e)
+            {
+              e.printStackTrace();
+            }
+          }
+
+        });
+        processArray[i].start();
+      }
+      Thread.sleep(3000);
+      controllerManager = HelixControllerMain.startHelixController(zkAddress,
+          clusterName, "controller", HelixControllerMain.STANDALONE);
+      Thread.sleep(5000);
+      printStatus(admin, clusterName, lockGroupName);
+      System.out.println("Stopping localhost_12000");
+      processArray[0].interrupt();
+      Thread.sleep(3000);
+      printStatus(admin, clusterName, lockGroupName);
+      Thread.currentThread().join();
+    } catch (Exception e)
+    {
+      e.printStackTrace();
+    } finally
+    {
+      if (controllerManager != null)
+      {
+        controllerManager.disconnect();
+      }
+      for (Thread process : processArray)
+      {
+        if (process != null)
+        {
+          process.interrupt();
+        }
+      }
+    }
+  }
+
+  private static void printStatus(HelixAdmin admin, String cluster,
+      String resource)
+  {
+    ExternalView externalView = admin
+        .getResourceExternalView(cluster, resource);
+    //System.out.println(externalView);
+    TreeSet<String> treeSet = new TreeSet<String>(
+        externalView.getPartitionSet());
+    System.out.println("lockName" + "\t" + "acquired By");
+    System.out.println("======================================");
+    for (String lockName : treeSet)
+    {
+      Map<String, String> stateMap = externalView.getStateMap(lockName);
+      String acquiredBy = null;
+      if (stateMap != null)
+      {
+        for(String instanceName:stateMap.keySet()){
+          if ("ONLINE".equals(stateMap.get(instanceName))){
+            acquiredBy = instanceName;
+            break;
+          }
+        }
+      }
+      System.out.println(lockName + "\t"
+          + ((acquiredBy != null) ? acquiredBy : "NONE"));
+    }
+  }
+
+  private static void startLocalZookeeper(int port) throws Exception
+  {
+    ZkServer server = null;
+    String baseDir = "/tmp/IntegrationTest/";
+    final String dataDir = baseDir + "zk/dataDir";
+    final String logDir = baseDir + "/tmp/logDir";
+    FileUtils.deleteDirectory(new File(dataDir));
+    FileUtils.deleteDirectory(new File(logDir));
+
+    IDefaultNameSpace defaultNameSpace = new IDefaultNameSpace()
+    {
+      @Override
+      public void createDefaultNameSpace(ZkClient zkClient)
+      {
+
+      }
+    };
+    int zkPort = 2199;
+    final String zkAddress = "localhost:" + zkPort;
+
+    server = new ZkServer(dataDir, logDir, defaultNameSpace, zkPort);
+    server.start();
+
+  }
+
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockProcess.java
----------------------------------------------------------------------
diff --git a/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockProcess.java b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockProcess.java
new file mode 100644
index 0000000..d2db761
--- /dev/null
+++ b/recipes/distributed-lock-manager/src/main/java/org/apache/helix/lockmanager/LockProcess.java
@@ -0,0 +1,128 @@
+package org.apache.helix.lockmanager;
+
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+import java.util.List;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.HelixManagerFactory;
+import org.apache.helix.InstanceType;
+import org.apache.helix.controller.HelixControllerMain;
+import org.apache.helix.manager.zk.ZKHelixAdmin;
+import org.apache.helix.model.InstanceConfig;
+
+public class LockProcess
+{
+  private String clusterName;
+  private String zkAddress;
+  private String instanceName;
+  private HelixManager participantManager;
+  private boolean startController;
+  private HelixManager controllerManager;
+
+  LockProcess(String clusterName, String zkAddress, String instanceName,
+      boolean startController)
+  {
+    this.clusterName = clusterName;
+    this.zkAddress = zkAddress;
+    this.instanceName = instanceName;
+    this.startController = startController;
+
+  }
+
+  public void start() throws Exception
+  {
+    System.out.println("STARTING "+ instanceName);
+    configureInstance(instanceName);
+    participantManager = HelixManagerFactory.getZKHelixManager(clusterName,
+        instanceName, InstanceType.PARTICIPANT, zkAddress);
+    participantManager.getStateMachineEngine().registerStateModelFactory(
+        "OnlineOffline", new LockFactory());
+    participantManager.connect();
+    if (startController)
+    {
+      startController();
+    }
+    System.out.println("STARTED "+ instanceName);
+
+  }
+
+  private void startController()
+  {
+    controllerManager = HelixControllerMain.startHelixController(zkAddress,
+        clusterName, "controller", HelixControllerMain.STANDALONE);
+  }
+
+  /**
+   * Configure the instance, the configuration of each node is available to
+   * other nodes.
+   * 
+   * @param instanceName
+   */
+  private void configureInstance(String instanceName)
+  {
+    ZKHelixAdmin helixAdmin = new ZKHelixAdmin(zkAddress);
+
+    List<String> instancesInCluster = helixAdmin
+        .getInstancesInCluster(clusterName);
+    if (instancesInCluster == null || !instancesInCluster.contains(instanceName))
+    {
+      InstanceConfig config = new InstanceConfig(instanceName);
+      config.setHostName("localhost");
+      config.setPort("12000");
+      helixAdmin.addInstance(clusterName, config);
+    }
+  }
+
+  public void stop()
+  {
+    if (participantManager != null)
+    {
+      participantManager.disconnect();
+    }
+    
+    if (controllerManager != null)
+    {
+      controllerManager.disconnect();
+    }
+  }
+
+  public static void main(String[] args) throws Exception
+  {
+    String zkAddress = "localhost:2199";
+    String clusterName = "lock-manager-demo";
+    // Give a unique id to each process, most commonly used format hostname_port
+    final String instanceName = "localhost_12000";
+    boolean startController = true;
+    final LockProcess lockProcess = new LockProcess(clusterName, zkAddress,
+        instanceName, startController);
+    Runtime.getRuntime().addShutdownHook(new Thread()
+    {
+      @Override
+      public void run()
+      {
+        System.out.println("Shutting down " + instanceName);
+        lockProcess.stop();
+      }
+    });
+    lockProcess.start();
+    Thread.currentThread().join();
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/pom.xml
----------------------------------------------------------------------
diff --git a/recipes/pom.xml b/recipes/pom.xml
index 2fb84b1..bb6acab 100644
--- a/recipes/pom.xml
+++ b/recipes/pom.xml
@@ -32,6 +32,7 @@ under the License.
   <modules>
     <module>rabbitmq-consumer-group</module>
     <module>rsync-replicated-file-system</module>
+    <module>distributed-lock-manager</module>
   </modules>
 
   <build>

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/recipes/rsync-replicated-file-system/README.md
----------------------------------------------------------------------
diff --git a/recipes/rsync-replicated-file-system/README.md b/recipes/rsync-replicated-file-system/README.md
index 31dd6e6..f8a74a0 100644
--- a/recipes/rsync-replicated-file-system/README.md
+++ b/recipes/rsync-replicated-file-system/README.md
@@ -1,44 +1,65 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
 Near real time rsync replicated file system
 ===========================================
 
 Quickdemo
-=========
+---------
 
 * This demo starts 3 instances with id's as ```localhost_12001, localhost_12002, localhost_12003```
-* Each instance stores its files under /tmp/<id>/filestore
+* Each instance stores its files under ```/tmp/<id>/filestore```
 * ``` localhost_12001 ``` is designated as the master and ``` localhost_12002 and localhost_12003``` are the slaves.
 * Files written to master are replicated to the slaves automatically. In this demo, a.txt and b.txt are written to ```/tmp/localhost_12001/filestore``` and it gets replicated to other folders.
 * When the master is stopped, ```localhost_12002``` is promoted to master. 
 * The other slave ```localhost_12003``` stops replicating from ```localhost_12001``` and starts replicating from new master ```localhost_12002```
 * Files written to new master ```localhost_12002``` are replicated to ```localhost_12003```
 * In the end state of this quick demo, ```localhost_12002``` is the master and ```localhost_12003``` is the slave. Manually create files under ```/tmp/localhost_12002/filestore``` and see that appears in ```/tmp/localhost_12003/filestore```
+* Ignore the interrupted exceptions on the console :-).
+
 
 ```
 git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
 cd recipes/rsync-replicated-file-system/
-mvn clean install package
+mvn clean install package -DskipTests
 cd target/rsync-replicated-file-system-pkg/bin
+chmod +x *
 ./quickdemo
 
 ```
 
-
 Overview
-========
+--------
 
 There are many applications that require storage for storing large number of relatively small data files. Examples include media stores to store small videos, images, mail attachments etc. Each of these objects is typically kilobytes, often no larger than a few megabytes. An additional distinguishing feature of these usecases is also that files are typically only added or deleted, rarely updated. When there are updates, they are rare and do not have any concurrency requirements.
 
 These are much simpler requirements than what general purpose distributed file system have to satisfy including concurrent access to files, random access for reads and updates, posix compliance etc. To satisfy those requirements, general DFSs are also pretty complex that are expensive to build and maintain.
  
-A different implementation of a distributed file system includes HDFS which is inspired by Google�s GFS. This is one of the most widely used distributed file system that forms the main data storage platform for Hadoop. HDFS is primary aimed at processing very large data sets and distributes files across a cluster of commodity servers by splitting up files in fixed size chunks. HDFS is not particularly well suited for storing a very large number of relatively tiny files.
+A different implementation of a distributed file system includes HDFS which is inspired by Google's GFS. This is one of the most widely used distributed file system that forms the main data storage platform for Hadoop. HDFS is primary aimed at processing very large data sets and distributes files across a cluster of commodity servers by splitting up files in fixed size chunks. HDFS is not particularly well suited for storing a very large number of relatively tiny files.
 
 ### File Store
 
-It�s possible to build a vastly simpler system for the class of applications that have simpler requirements as we have pointed out.
+It's possible to build a vastly simpler system for the class of applications that have simpler requirements as we have pointed out.
 
 * Large number of files but each file is relatively small.
 * Access is limited to create, delete and get entire files.
-* No updates to files that are already created (or it�s feasible to delete the old file and create a new one).
+* No updates to files that are already created (or it's feasible to delete the old file and create a new one).
  
 
 We call this system a Partitioned File Store (PFS) to distinguish it from other distributed file systems. This system needs to provide the following features:
@@ -53,7 +74,7 @@ Apache Helix is a generic cluster management framework that makes it very easy t
 Rsync can be easily used as a replication channel between servers so that each file gets replicated on multiple servers.
 
 Design
-======
+------
 
 High level 
 
@@ -66,14 +87,13 @@ High level
 ### Transaction log
 
 Every write on the master will result in creation/deletion of one or more files. In order to maintain timeline consistency slaves need to apply the changes in the same order. 
-To facilitate this, the master logs each transaction in a file. Each transaction is associated with an id. The transaction id has two parts a generation and sequence. 
-For every transaction the sequence number increases and and generation increases when a new master is elected. 
+To facilitate this, the master logs each transaction in a file and each transaction is associated with an 64 bit id in which the 32 LSB represents a sequence number and MSB represents the generation number.
+Sequence gets incremented on every transaction and and generation is increment when a new master is elected. 
 
 ### Replication
 
-Replication is needed for the slave to keep up with the changes on the master. Every time the slave applies a change it checkpoints the last applied transaction id. 
-During restarts, this allows the slave the restart to pull changes from the last checkpointed transaction. Similar to a master, the slave logs each transaction to the transaction logs but 
-instead of generating new transaction id, it uses the same id generated by the master.
+Replication is required to slave to keep up with the changes on the master. Every time the slave applies a change it checkpoints the last applied transaction id. 
+During restarts, this allows the slave to pull changes from the last checkpointed id. Similar to master, the slave logs each transaction to the transaction logs but instead of generating new transaction id, it uses the same id generated by the master.
 
 
 ### Fail over
@@ -83,7 +103,50 @@ changes from previous master before taking up mastership. The new master will re
 with sequence starting from 1. After this the master will begin accepting writes. 
 
 
-![Partitioned File Store](images/system.png)
+![Partitioned File Store](../images/PFS-Generic.png)
+
+
+
+Rsync based solution
+-------------------
+
+![Rsync based File Store](../images/RSYNC_BASED_PFS.png)
+
+
+This application demonstrate a file store that uses rsync as the replication mechanism. One can envision a similar system where instead of using rsync, 
+can implement a custom solution to notify the slave of the changes and also provide an api to pull the change files.
+#### Concept
+* file_store_dir: Root directory for the actual data files 
+* change_log_dir: The transaction logs are generated under this folder.
+* check_point_dir: The slave stores the check points ( last processed transaction) here.
+
+#### Master
+* File server: This component support file uploads and downloads and writes the files to ```file_store_dir```. This is not included in this application. Idea is that most applications have different ways of implementing this component and has some business logic associated with it. It is not hard to come up with such a component if needed.
+* File store watcher: This component watches the ```file_store_dir``` directory on the local file system for any changes and notifies the registered listeners of the changes.
+* Change Log Generator: This registers as a listener of File System Watcher and on each notification logs the changes into a file under ```change_log_dir```. 
+
+####Slave
+* File server: This component on the slave will only support reads.
+* Cluster state observer: Slave observes the cluster state and is able to know who is the current master. 
+* Replicator: This has two subcomponents
+    - Periodic rsync of change log: This is a background process that periodically rsyncs the ```change_log_dir``` of the master to its local directory
+    - Change Log Watcher: This watches the ```change_log_dir``` for changes and notifies the registered listeners of the change
+    - On demand rsync invoker: This is registered as a listener to change log watcher and on every change invokes rsync to sync only the changed file.
+
+
+#### Coordination
+
+The coordination between nodes is done by Helix. Helix does the partition management and assigns the partition to multiple nodes based on the replication factor. It elects one the nodes as master and designates others as slaves.
+It provides notifications to each node in the form of state transitions ( Offline to Slave, Slave to Master). It also provides notification when there is change is cluster state. 
+This allows the slave to stop replicating from current master and start replicating from new master. 
+
+In this application, we have only one partition but its very easy to extend it to support multiple partitions. By partitioning the file store, one can add new nodes and Helix will automatically 
+re-distribute partitions among the nodes. To summarize, Helix provides partition management, fault tolerance and facilitates automated cluster expansion.
+
+
+
+
+
 
 
 

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/src/site/markdown/recipes/lock_manager.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/recipes/lock_manager.md b/src/site/markdown/recipes/lock_manager.md
new file mode 100644
index 0000000..5ef55d9
--- /dev/null
+++ b/src/site/markdown/recipes/lock_manager.md
@@ -0,0 +1,180 @@
+<!---
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+Distributed lock manager
+------------------------
+Distributed locks are used to synchronize accesses shared resources. Most applications use Zookeeper to model the distributed locks. 
+
+The simplest way to model a lock using zookeeper is (See Zookeeper leader recipe for an exact and more advanced solution)
+
+* Each process tries to create an emphemeral node.
+* If can successfully create it then, it acquires the lock
+* Else it will watch on the znode and try to acquire the lock again.
+
+This is good enough if there is only one lock. But in practice, an application will have many such locks. Distributing and managing the locks among difference process becomes challenging. Extending such a solution to many locks will result in
+
+* Uneven distribution of locks among nodes, the node that starts first will acquire all the lock. Nodes that start later will be idle.
+* When a node fails, how the locks will be distributed among remaining nodes is not predicable. 
+* When new nodes are added the current nodes dont relinquish the locks so that new nodes can acquire some locks
+
+In other words we want a system to satisfy the following requirements.
+
+* Distribute locks evenly among all nodes to get better hardware utilization
+* If a node fails, the locks that were acquired by that node should be evenly distributed among other nodes
+* If nodes are added, locks must be evenly re-distributed among nodes.
+
+Helix provides a simple and elegant solution to this problem. Simply specify the number of locks and Helix will ensure that above constraints are satisfied. 
+
+To quickly see this working run the lock-manager-demo script where 12 locks are evenly distributed among three nodes, and when a node fails, the locks get re-distributed among remaining two nodes. Note that Helix does not re-shuffle the locks completely, instead it simply distributes the locks relinquished by dead node among 2 remaining nodes evenly.
+
+#### Quick version
+ This version starts multiple threads with in same process to simulate a multi node deployment. Try the long version to get a better idea of how it works.
+ 
+```
+git clone https://git-wip-us.apache.org/repos/asf/incubator-helix.git
+cd incubator-helix
+mvn clean install package -DskipTests
+cd recipes/distributed-lock-manager/target/distributed-lock-manager-pkg/bin
+chmod +x *
+./lock-manager-demo
+
+``` 
+
+#### Long version
+This provides more details on how to setup the cluster and where to plugin application code.
+
+#### start zookeeper
+
+```
+./start-standalone-zookeeper 2199
+```
+
+#### Create a cluster
+
+```
+./helix-admin --zkSvr localhost:2199 --addCluster lock-manager-demo
+```
+
+#### Create a lock group
+
+Create a lock group and specify the number of locks in the lock group. You can change add new locks dynamically later. 
+```
+./helix-admin --zkSvr localhost:2199  --addResource lock-manager-demo lock-group 6 OnlineOffline AUTO_REBALANCE
+```
+
+#### Start the nodes
+
+Create a Lock class that handles the callbacks. 
+
+```
+public class Lock extends StateModel
+{
+  private String lockName;
+
+  public Lock(String lockName)
+  {
+    this.lockName = lockName;
+  }
+
+  public void lock(Message m, NotificationContext context)
+  {
+    System.out.println(" acquired lock:"+ lockName );
+  }
+
+  public void release(Message m, NotificationContext context)
+  {
+    System.out.println(" releasing lock:"+ lockName );
+  }
+
+}
+
+```
+
+LockFactory that creates the lock 
+```
+public class LockFactory extends StateModelFactory<Lock>{
+	/* Instantiates the lock handler, one per lockName*/
+	
+    public Lock create(String lockName)
+    {
+    	return new Lock(lockName);
+    }   
+}
+```
+Thats it, now when the node starts simply join the cluster and helix will invoke the appropriate call backs on Lock.
+
+```
+public class MyClass{
+
+	public static void main(String args){
+	    String zkAddress= "localhost:2199";
+	    String clusterName = "lock-manager-demo";
+	    //Give a unique id to each process, most commonly used format hostname_port
+	    String instanceName ="localhost_12000";
+	    ZKHelixAdmin helixAdmin = new ZKHelixAdmin(zkAddress);
+	    //configure the instance and provide some metadata 
+	    InstanceConfig config = new InstanceConfig(instanceName);
+	    config.setHostName("localhost");
+	    config.setPort("12000");
+	    admin.addInstance(clusterName, config);
+	    //join the cluster
+		HelixManager manager = HelixManager.getHelixManager(clusterName,null,InstanceType.PARTICIPANT,zkAddress);
+		manager.getStateMachineEngine().registerStateModelFactory("OnlineOffline", modelFactory);
+		manager.connect();
+		Thread.currentThread.join();
+	}
+
+}
+```
+
+#### Start the controller
+
+Controller can be started either as a separate process or can be embedded within each node process
+
+##### Separate process
+This is recommended when number of nodes in the cluster >100. For fault tolerance, you can run multiple controllers on different boxes.
+
+```
+./run-helix-controller --zkSvr localhost:2199 --cluster mycluster 2>&1 > /tmp/controller.log &
+```
+
+##### Embedded within the node process
+This is recommended when the number of nodes in the cluster is less than 100. To start a controller from each process, simply add the following lines to MyClass
+
+```
+public class MyClass{
+
+	public static void main(String args){
+	    String zkAddress= "localhost:2199";
+	    String clusterName = "lock-manager-demo";
+	    .
+	    .
+	    manager.connect();
+	    final HelixManager controller = HelixControllerMain.startHelixController(zkAddress, clusterName, "controller", HelixControllerMain.STANDALONE);
+	    Thread.currentThread.join();
+	}
+}
+```
+
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/incubator-helix/blob/7b6790dc/src/site/resources/images/helix-logo.jpg
----------------------------------------------------------------------
diff --git a/src/site/resources/images/helix-logo.jpg b/src/site/resources/images/helix-logo.jpg
new file mode 100644
index 0000000..d6428f6
Binary files /dev/null and b/src/site/resources/images/helix-logo.jpg differ


Mime
View raw message