trafodion-dev mailing list archives

From Gunnar Tapper <gunnar.tap...@esgyn.com>
Subject RE: Re: Re: installer: ERROR (line 1): Unknown statement 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
Date Fri, 28 Aug 2015 00:41:37 GMT
Hi,

Please note that heterogeneous cluster configuration is the norm due to node
roles. For example:

- Master Nodes (HDFS NameNode, HBase Master, etc.) use powerful scale-up
servers to avoid bottleneck issues.
- Slave Nodes (HDFS DataNode, HBase RegionServer, MapReduce TaskTrackers,
etc.) use less powerful servers since they exploit parallelism vs. scale up.
- The HP Hadoop Reference Architecture (Minotaur) uses more powerful nodes
for Storage Nodes (HDFS DataNode, HBase RegionServers) vs. Compute Nodes
(MapReduce TaskTrackers, etc.).
- Edge Nodes (public access such as Connectivity Services, Oozie, Pig,
HCatalog) are likely to be more powerful than Slave/Compute Nodes depending
on what services are running on the Edge Nodes.

I'd strongly suggest striving for homogeneous node hardware within a role
group, where possible. Otherwise, the least powerful node in a role group
will dictate the overall performance for the role. That's just the nature
of parallel computing. Naturally, smaller clusters can't afford to have
role-specific nodes. In such cases, it's a good idea to size the nodes
based on service placement, understanding that some of the nodes will be
oversubscribed due to the Master Services mentioned above.
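
To put rough numbers on the "least powerful node" point, here is a minimal
sketch. It assumes the work for a role is split evenly across the nodes in
the group and is not rebalanced at run time, which is a simplification. The
function names and the speed values are made up for illustration.

```python
def effective_throughput(node_speeds):
    """Aggregate throughput of a role group when work is split evenly.

    Every node receives the same share of the work, so the group finishes
    only when its slowest node finishes. The group therefore behaves like
    len(node_speeds) copies of the slowest node.
    """
    if not node_speeds:
        raise ValueError("role group has no nodes")
    return len(node_speeds) * min(node_speeds)


def idle_capacity(node_speeds):
    """Capacity that the faster nodes cannot contribute under even splitting."""
    return sum(node_speeds) - effective_throughput(node_speeds)


# Hypothetical relative speeds (units are arbitrary).
homogeneous = [10, 10, 10, 10]
mixed = [10, 10, 10, 6]

print(effective_throughput(homogeneous), idle_capacity(homogeneous))
print(effective_throughput(mixed), idle_capacity(mixed))
```

Under these assumptions, a single slower node in a group of four lowers the
throughput of the whole group, not just its own share.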

Thanks,

Gunnar

-----Original Message-----
From: D. Markt [mailto:dmarkt7370@gmail.com]
Sent: Thursday, August 27, 2015 4:50 PM
To: dev@trafodion.incubator.apache.org
Cc: 'Ming Liu' <ovis_poly@sina.com>
Subject: RE: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .

  I don't think there's anything to correct; I would just say this is
probably a legacy concept that might not be as useful for Trafodion.
Originally, pre-Trafodion, sqconfig was created by the user to describe the
cluster and how it was to be used (e.g., the roles attribute).  It looks
like the Trafodion installer makes the simplifying assumption that the nodes
are homogeneous.  I'm not sure that any Trafodion process currently uses any
of the attributes other than to know which nodes are in the cluster.  But if
any do, I would assume the user could modify sqconfig and run sqgen to
update the environment.

  Regardless, the operational issues of running HBase on heterogeneous nodes
would remain even without Trafodion being considered.  For example, if one
node were to be significantly more powerful or have more memory than another
node and they both hosted Region Servers then you would probably not see
that benefit for any application that used both nodes.  But in an HA
configuration where several nodes have no Region Servers, those nodes could
be different than the nodes hosting Region Servers and that should be fine.
But for nodes hosting similar components, a homogeneous configuration makes
everything a bit easier.

Regards,
Dennis

-----Original Message-----
From: Hans Zeller [mailto:hans.zeller@esgyn.com]
Sent: Thursday, August 27, 2015 4:32 PM
To: dev <dev@trafodion.incubator.apache.org>
Cc: Ming Liu <ovis_poly@sina.com>
Subject: Re: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .

Hi Radu, yes, it looks at only one node. Generally, we don't recommend
clusters made from different hardware. But, in practice, that may not always
be avoidable (e.g. when I have older 6-core CPUs and the currently available
servers I can buy all have 8 cores). Ideally, I would repurpose the existing
hardware and get a homogeneous set of nodes for the upgraded cluster.

As far as I know, the number of cores is just used as a guideline.
Therefore, you could have a cluster with a mix of hardware configurations
and it should still work, as long as the nodes are fairly similar.
Performance of such a cluster may not be optimal, though.

I hope others will correct me if I'm wrong.


Hans

On Thu, Aug 27, 2015 at 9:27 AM, Radu Marias <radumarias@gmail.com> wrote:

> I noticed that cores and CPUs are queried only once, on the installer
> node, and added to the config file for all of the nodes. Does this imply
> that all the nodes must have the same CPU? Is it not possible to have
> nodes with a different CPU config?
>
> On Thu, Aug 27, 2015 at 5:55 PM, Radu Marias <radumarias@gmail.com> wrote:
>
> > Managed to have the installer working by getting the cores and cpus
> > from /proc/cpuinfo.
> >
> > Attached is my patch. I think it's best for Amanda or someone else to
> > review it and add it to git if it's considered worthwhile.
> >
> > Now I'm having other errors with sqstart, but will start another
> > thread for that.
> >
> > On Thu, Aug 27, 2015 at 11:38 AM, Liu Ming <ovis_poly@sina.com> wrote:
> >
> >> This is a great help, Radu! I agree with you that there are other,
> >> more robust ways to get the core count than calling lscpu, as you
> >> found out, so I feel it would be great if you could help improve this
> >> code by committing your change to Trafodion.
> >> thanks, Ming
> >> ----- Original Message -----
> >> From: Radu Marias <radumarias@gmail.com>
> >> To: ovis_poly@sina.com, dev <dev@trafodion.incubator.apache.org>
> >> Subject: Re: Re: installer: ERROR (line 1): Unknown statement
> >> 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
> >> Date: 2015-08-27 16:22
> >>
> >> I have lscpu installed, but all nodes are virtual machines, so it
> >> might be a limitation of the VMs. I get this output:
> >> # lscpu
> >> lscpu: failed to determine number of CPUs: /sys/devices/system/cpu/possible: No such file or directory
> >> I'm now trying to determine the number of CPUs with other commands
> >> and change the installer script accordingly, like here or other
> >> alternatives:
> >>
> >> http://stackoverflow.com/questions/6481005/how-to-obtain-the-number-of-cpus-cores-in-linux-from-the-command-line
> >> On Thu, Aug 27, 2015 at 10:59 AM Liu Ming <ovis_poly@sina.com> wrote:
> >> > Hi Radu,
> >> > As Amanda suggested, the sqconfig file, which is auto-generated by
> >> > the installer, contains an error. Judging from the error message
> >> > you sent, the problem is in the 'cores' field.
> >> > 'cores' is generated by parsing the output of the 'lscpu' command.
> >> > Would you please paste the output of lscpu on your system as well?
> >> > Maybe you don't have lscpu installed; in that case you can install
> >> > it: yum install util-linux-ng
> >> > A 'basic server' installation of CentOS should have that package,
> >> > so if it is not installed, I am curious which option you chose when
> >> > you installed CentOS. As mentioned in the Trafodion wiki, the
> >> > recommended OS is CentOS 6.x; I am not sure whether Trafodion works
> >> > well on CentOS 7.
> >> > If it is already installed, let us check the output of lscpu.
> >> > thanks, Ming
> >> > ----- Original Message -----
> >> > From: Amanda Moran <amanda.moran@esgyn.com>
> >> > To: dev <dev@trafodion.incubator.apache.org>
> >> > Subject: Re: installer: ERROR (line 1): Unknown statement
> >> > 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
> >> > Date: 2015-08-27 06:46
> >> >
> >> > Your sqconfig file has not been generated properly. Can you copy the
> >> > file <path>installer/sqconfig_node1 into an email? This file is
> >> > generated by the script <path>/installer/traf_sqconfig.
> >> > On Wed, Aug 26, 2015 at 3:41 PM, Radu Marias <radumarias@gmail.com> wrote:
> >> > > Hi,
> >> > >
> >> > > I have a cluster of 5 nodes, each a virtual machine.
> >> > > This is what is installed on them:
> >> > > Centos 7
> >> > > Ambari 2.1
> >> > > HDP 2.2
> >> > > perl.x86_64 4:5.16.3-285.el7
> >> > > jdk1.7.0_67, installed by ambari
> >> > >
> >> > > When running the installer, I get these errors after HDP is
> >> > > restarted and the trafodion folders are created on the nodes:
> >> > >
> >> > > ***INFO: Trafodion Mods ran successfully.
> >> > >
> >> > > ******************************
> >> > >  TRAFODION START
> >> > > ******************************
> >> > >
> >> > > /usr/lib/trafodion/installer/..
> >> > > ***INFO: Log file location
> >> > > /var/log/trafodion/trafodion_install_2015-08-26-19-44-39.log
> >> > > ***INFO: traf_start
> >> > > ******************************************
> >> > > ******************************************
> >> > > ******************************************
> >> > > ******************************************
> >> > > /home/trafodion/trafodion-20150821_0830
> >> > > ***INFO: untarring build file /usr/lib/trafodion/trafodion-20150821_0830/trafodion_server-1.2.0.tgz to /home/trafodion/trafodion-20150821_0830
> >> > > ***INFO: modifying .bashrc to set Trafodion environment variables
> >> > > ***INFO: copying .bashrc file to all nodes
> >> > > ***INFO: copying sqconfig file (/home/trafodion/sqconfig) to /home/trafodion/trafodion-20150821_0830/sql/scripts/sqconfig
> >> > > ***INFO: Creating /home/trafodion/trafodion-20150821_0830 directory on all nodes
> >> > > ***INFO: starting sqgen
> >> > > node1,node2,node3,node4,node5
> >> > >
> >> > > Creating directories on cluster nodes
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/etc
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/logs
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/tmp
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/sql/scripts
> >> > >
> >> > > The SQ environment variable file
> >> > > /home/trafodion/trafodion-20150821_0830/etc/ms.env exists.
> >> > > The file will not be re-generated.
> >> > >
> >> > >
> >> > >        ERROR (line 1):  Unknown statement 0;node-
> >> > >                name=node1;cores=;processors=1;ro
> >> > >                les=connection,aggregation,storage .
> >> > > Note: Using cluster.conf format type 2.
> >> > > For "node-id=0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage
> >> > > ":
> >> > >    Error: Enclosure name not specified
> >> > >    Error: Enclosure node list not specified
> >> > >    Error: not a valid node configuration statement.
> >> > > Exiting without generating cluster.conf due to errors.
> >> > > ***ERROR: sqgen failed with RC=1. Check install log file for details.
> >> > > ***ERROR: Error while running traf_start.
> >> > > ***ERROR: Setup not complete, review logs.
> >> > > ***ERROR: Exiting....
> >> > >
> >> > > Also, to get this far I had to install some Perl modules:
> >> > >
> >> > > yum -y install perl-version.x86_64
> >> > > yum -y install perl-DBI.x86_64
> >> > > yum -y install DBD::SQLite
> >> > >
> >> > > And add ctime.pl to /usr/share/perl5/. File from
> >> > > https://github.com/dwimperl/perl-5.12.3.0/blob/master/perl/lib/ctime.pl
> >> > >
> >> > > --
> >> > > And in the end, it's not the years in your life that count.
> >> > > It's the life in your years.
> >> > >
> >> > --
> >> > Thanks,
> >> > Amanda Moran
> >> >
> >>
> >
> >
> >
> > --
> > And in the end, it's not the years in your life that count. It's the
> > life in your years.
> >
>
>
>
> --
> And in the end, it's not the years in your life that count. It's the
> life in your years.
>
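
The quoted thread traces the sqgen failure to an empty `cores=` field, and
Radu reports working around it by reading /proc/cpuinfo instead of calling
lscpu. His patch is not reproduced here, so the following is only a sketch
of that idea, not the installer's actual code. The function names are
hypothetical, and the fallback for VMs that omit the "core id" lines is an
assumption. The second helper checks a generated sqconfig statement for
empty values before it is handed to sqgen.

```python
def parse_cpuinfo(text):
    """Return (logical_cpus, physical_cores, sockets) from /proc/cpuinfo text.

    Some virtual machines omit the "physical id" and "core id" lines. In
    that case this sketch treats every logical CPU as one core in a single
    socket.
    """
    logical = 0
    cores = set()
    sockets = set()
    physical_id = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
            physical_id = None  # a new processor block starts here
        elif key == "physical id":
            physical_id = value
            sockets.add(value)
        elif key == "core id":
            cores.add((physical_id, value))
    return logical, (len(cores) or logical), (len(sockets) or 1)


def empty_fields(statement):
    """List the keys in a ';'-separated key=value statement with no value."""
    missing = []
    for field in statement.strip().split(";"):
        key, sep, value = field.partition("=")
        if sep and not value.strip():
            missing.append(key.strip())
    return missing


# The statement from the error message in this thread.
bad = "node-id=0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage"
print(empty_fields(bad))

# On a Linux node, the counts could be read like this:
#     with open("/proc/cpuinfo") as f:
#         print(parse_cpuinfo(f.read()))
```

Running a check like `empty_fields` on every generated line would surface a
failed CPU query at generation time rather than as an "Unknown statement"
error from sqgen.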
