From: Suraj Menon
To: user@hama.apache.org
Date: Wed, 26 Dec 2012 08:53:28 -0500
Subject: Re: What is the best configuration for Cluster hama distributed mode

Hi Francis,

> First, I would like to know if anyone has more comprehensive
> documentation on Hama cluster configuration.

On AWS EC2, you can use Apache Whirr to configure the cluster. Edward may share his procedures for maintaining his Hama cluster on Oracle BDA.

> I would also like some information about Hama cluster configuration:
>
> 1) I have a cluster of 12 computers running HDFS. What is the optimal
> replication configuration? It is configured to create 3 replicas of each
> file; is this the best?

This depends on your availability requirements and the capacity of your cluster. 3 would be good if you cannot tolerate data loss. You would have to work this out based on the size of your data and the capacity of your cluster.

> 2) In my hama-site.xml, what is the best value for the configuration
> parameter hama.zookeeper.quorum? 1 node, 2 nodes, 3 nodes?

Once again, this depends on your availability requirements and the usage of the cluster.
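To make the two trade-offs above a little more concrete, here is a rough Java sketch. It relies on the general rule that a majority-based ZooKeeper ensemble of n servers stays available while more than half of them are up, and on the fact that HDFS stores each block once per replica. The class name, the method names, and the 1 TB-per-node disk size are made up for illustration; they are not taken from Hama or from your cluster.

```java
public class QuorumSketch {
    // Number of server failures a majority-based ensemble of the given size
    // can survive while still keeping a quorum.
    static int tolerableFailures(int ensembleSize) {
        return (ensembleSize - 1) / 2;
    }

    // Usable capacity once every block is stored `replication` times.
    static double usableCapacity(double rawCapacity, int replication) {
        return rawCapacity / replication;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 3; n++) {
            System.out.println(n + " ZooKeeper node(s): survives "
                    + tolerableFailures(n) + " failure(s)");
        }
        // Hypothetical figure: 12 machines with 1 TB of raw disk each.
        double rawTb = 12 * 1.0;
        System.out.println("Usable TB at replication 3: " + usableCapacity(rawTb, 3));
    }
}
```

Under that rule, 2 ZooKeeper nodes tolerate no more failures than 1, which is why odd ensemble sizes are the usual choice when availability matters.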
> 3) When I process my graph with just over 65 000 vertices, I get the
> following error:
> attempt_201212260904_0005_000031_0: Exception in thread "pool-2-thread-1"
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> attempt_201212260904_0005_000031_0: Exception in thread "Thread-1"
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Is there any parameter I can change to increase the memory limit? Or will
> my cluster not be able to process this amount of information? With
> smaller graphs it works correctly. I'm working on the all-pairs problem.

As reported recently by other users, Hama is facing scalability issues. I am trying to close https://issues.apache.org/jira/browse/HAMA-559 and some other message object lifecycle issues. (Today we create a new Writable object for every message read and received.) Also, we keep all the vertices in memory. However, you can change your JVM arguments. Please look at what you can do with the configuration parameter bsp.child.java.opts. The default value can be found in hama-default.xml.

Regards,
Suraj
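As a back-of-the-envelope check on whether a larger heap alone is likely to help, the sketch below estimates the memory an all-pairs computation might need. It assumes, purely for illustration, that the job ends up holding one 8-byte value for every ordered pair of vertices and that the load spreads evenly over the 12 machines; the email does not say how the job actually stores its state, and the class and method names are hypothetical.

```java
public class AllPairsMemorySketch {
    // Bytes needed to hold one entry of bytesPerEntry for every ordered
    // pair of vertices (vertices * vertices entries in total).
    static long pairTableBytes(long vertices, long bytesPerEntry) {
        return vertices * vertices * bytesPerEntry;
    }

    public static void main(String[] args) {
        long vertices = 65_000L;          // graph size from the question
        long bytesPerEntry = 8L;          // assumption: one long or double per pair
        long total = pairTableBytes(vertices, bytesPerEntry);
        long perNode = total / 12;        // assumption: even split over 12 machines
        System.out.printf("total: %.1f GB, per node: %.1f GB%n",
                total / 1e9, perNode / 1e9);
    }
}
```

Under these assumptions the raw data alone is about 33.8 GB, or roughly 2.8 GB per machine before any object overhead or message buffers. If the heap set through bsp.child.java.opts (for example with the standard JVM option -Xmx) is well below that, raising it may help, but only up to the physical memory available on each machine.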