trafodion-dev mailing list archives

From Gunnar Tapper <gunnar.tap...@esgyn.com>
Subject RE: Re: Re: installer: ERROR (line 1): Unknown statement 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
Date Fri, 28 Aug 2015 05:30:51 GMT
Hi,

Maybe stating the obvious... You might not even have a choice about getting a
faster node when doing a hardware replacement, since the older model might
simply no longer be available. This is especially true for larger clusters,
where the size drives mean-time-between-failure rates -- it'll be almost
impossible to keep the cluster homogeneous.

Same goes for disks, which typically have a shorter mean time between
failures due to technology limitations and sheer numbers.

Thanks,

Gunnar

-----Original Message-----
From: D. Markt [mailto:dmarkt7370@gmail.com]
Sent: Thursday, August 27, 2015 11:25 PM
To: 'Rohit' <rohit.jain@esgyn.com>; dev@trafodion.incubator.apache.org
Cc: 'Ming Liu' <ovis_poly@sina.com>
Subject: RE: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .

  I’m not disagreeing, but I think Gunnar stated my observation/suggestion
better than I did: “I'd strongly suggest striving for homogenous node
hardware within a role group, where possible.”  My HBase experience shows
that performance improves with data locality; that is, having the DataNode
(DN) and RegionServer (RS) hosting the blocks/regions on the same node.  So
even if you have a newer node that is 2x in every respect compared to the
older nodes, you would have to put 2x the number of regions on that node;
otherwise I don’t know (but that doesn’t mean it can’t be done) how WMS could
take full advantage of the faster node.  Even if that can be done, you have
to consider the HA aspects, because you will be replicating those 2x regions
to older nodes.  So if that new node fails, the overall system workload would
go down by more than 1 node’s worth of processing.  That is why I suggested
that homogeneous nodes simplify many things.
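
To put rough numbers on the failure scenario described above, here is a
minimal sketch in Python. It assumes that throughput scales linearly with a
node's relative capacity and that regions are placed in proportion to that
capacity; neither assumption comes from Trafodion, HBase, or WMS, and the
function names and node counts are made up for illustration.

```python
from fractions import Fraction


def capacity_lost_on_failure(capacities, failed_index):
    """Fraction of total cluster capacity lost when one node fails.

    capacities: relative capacity per node, in hypothetical integer units.
    """
    total = sum(capacities)
    return Fraction(capacities[failed_index], total)


def survivor_load_factor(capacities, failed_index):
    """How much busier the surviving nodes get if they absorb all the work."""
    total = sum(capacities)
    remaining = total - capacities[failed_index]
    return Fraction(total, remaining)


# Five identical nodes (illustrative numbers only).
homogeneous = [1, 1, 1, 1, 1]
# Four older nodes plus one newer node that is 2x in every respect.
mixed = [1, 1, 1, 1, 2]

print(capacity_lost_on_failure(homogeneous, 0))  # 1/5
print(capacity_lost_on_failure(mixed, 4))        # 1/3
print(survivor_load_factor(homogeneous, 0))      # 5/4
print(survivor_load_factor(mixed, 4))            # 3/2
```

Under these assumptions, losing the 2x node removes a third of the cluster's
capacity instead of a fifth, and the four older nodes would have to run at
1.5x their normal load instead of 1.25x.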



  That doesn’t mean adding or replacing an existing node with a newer,
faster node should be avoided; add the node and you’ll get as good if not
better performance than before.  But it will take more effort and
forethought to configure the HBase/HDFS roles and region placements to take
full advantage of the extra processing power.



Regards,

Dennis



From: Rohit [mailto:rohit.jain@esgyn.com]
Sent: Thursday, August 27, 2015 11:33 PM
To: D. Markt <dmarkt7370@gmail.com>; dev@trafodion.incubator.apache.org
Cc: 'Ming Liu' <ovis_poly@sina.com>
Subject: RE: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .



Hadoop's strength is its elastic scalability. That means that as the
customer grows their cluster, they will end up with the newer nodes being
more powerful than the older ones. There was always a thought that WMS would
understand this difference and leverage it to load balance workloads,
especially when using adaptive segmentation. But for operational workloads
this is even simpler - load the more powerful nodes with more transactions,
or more CPU intensive, or more memory intensive queries. Of course, we don't
do that now, and it will take some effort, but WMS always had the foundation
of this capability built into it. Maybe Rao can comment.



Rohit



-------- Original message --------
From: "D. Markt" <dmarkt7370@gmail.com>
Date: 08/27/2015 5:49 PM (GMT-06:00)
To: dev@trafodion.incubator.apache.org
Cc: 'Ming Liu' <ovis_poly@sina.com>
Subject: RE: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .

  I don't think there's anything to correct; I would just say this is
probably a legacy concept that might not be as useful for Trafodion.
Originally, pre-Trafodion, sqconfig was created by the user to describe the
cluster and how it was to be used (e.g., the roles attribute).  It looks
like the Trafodion installer makes the simplifying assumption that the nodes
are homogeneous.  I'm not sure that any Trafodion process currently uses any
of the attributes other than to know which nodes are in the cluster.  But if
one does, I would assume the sqconfig could be modified by the user and sqgen
executed to update the environment.
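
As an illustration of the node attributes being discussed, here is a minimal
sketch in Python that splits a node line of the kind reported in this thread
and flags attributes with no value. The list of required keys is an
assumption taken only from the keys visible in the reported line; it is not
the documented sqconfig grammar, and this is not how sqgen itself parses the
file.

```python
# Keys seen in the node line from this thread; treating all of them as
# required is an assumption made for this sketch.
ASSUMED_REQUIRED_KEYS = ("node-id", "node-name", "cores", "processors", "roles")


def parse_node_line(line):
    """Split a 'key=value;key=value' node line into a dict of strings."""
    fields = {}
    for part in line.strip().split(";"):
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields


def empty_fields(fields, required=ASSUMED_REQUIRED_KEYS):
    """Return the required keys that are missing or have an empty value."""
    return [key for key in required if not fields.get(key)]


# The node line quoted in the installer error in this thread.
line = ("node-id=0;node-name=node1;cores=;processors=1;"
        "roles=connection,aggregation,storage")
fields = parse_node_line(line)
problems = empty_fields(fields)
print(problems)  # ['cores']
```

A check like this, run before sqgen, would point directly at the empty
`cores=` value rather than leaving it to surface as an "Unknown statement"
error.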

  Regardless, the operational issues of running HBase on heterogeneous nodes
would remain even without Trafodion being considered.  For example, if one
node were to be significantly more powerful or have more memory than another
node and they both hosted Region Servers then you would probably not see
that benefit for any application that used both nodes.  But in an HA
configuration where several nodes have no Region Servers, those nodes could
be different than the nodes hosting Region Servers and that should be fine.
But for nodes hosting similar components, a homogeneous configuration makes
everything a bit easier.

Regards,
Dennis

-----Original Message-----
From: Hans Zeller [mailto:hans.zeller@esgyn.com]
Sent: Thursday, August 27, 2015 4:32 PM
To: dev <dev@trafodion.incubator.apache.org>
Cc: Ming Liu <ovis_poly@sina.com>
Subject: Re: Re: Re: installer: ERROR (line 1): Unknown statement
0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .

Hi Radu, yes, it looks at only one node. Generally, we don't recommend
clusters made from different hardware. But, in practice, that may not always
be avoidable (e.g. when I have older 6-core CPUs and the currently available
servers I can buy all have 8 cores). Ideally, I would repurpose the existing
hardware and get a homogeneous set of nodes for the upgraded cluster.

As far as I know, the number of cores is just used as a guideline.
Therefore, you could have a cluster with a mix of hardware configurations
and it should still work, as long as the nodes are fairly similar.
Performance of such a cluster may not be optimal, though.

I hope others will correct me if I'm wrong.


Hans

On Thu, Aug 27, 2015 at 9:27 AM, Radu Marias <radumarias@gmail.com> wrote:

> I noticed that cores and cpus are queried only once, for the installer
> node, and added to the config file for all of the nodes. So does this imply
> that all the nodes must have the same cpu? Is it not possible to have nodes
> with a different cpu config?
>
> On Thu, Aug 27, 2015 at 5:55 PM, Radu Marias <radumarias@gmail.com> wrote:
>
> > Managed to have the installer working by getting the cores and cpus
> > from /proc/cpuinfo.
> >
> > Attached is my patch. I think it's better for Amanda or someone else to
> > review it and add it to git if it's considered worthwhile.
> >
> > Now I'm having other errors with sqstart, but will start another
> > thread for that.
> >
> > On Thu, Aug 27, 2015 at 11:38 AM, Liu Ming <ovis_poly@sina.com> wrote:
> >
> >> This is a very great help, Radu! I agree with you that there are other,
> >> more robust ways to get the core count than calling lscpu, as you found
> >> out, so I feel it would be great if you could help improve this code by
> >> committing your change to Trafodion.
> >> thanks, Ming
> >> ----- Original Message -----
> >> From: Radu Marias <radumarias@gmail.com>
> >> To: ovis_poly@sina.com, dev <dev@trafodion.incubator.apache.org>
> >> Subject: Re: Re: installer: ERROR (line 1): Unknown statement 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
> >> Date: 2015-08-27 16:22
> >>
> >> I have lscpu installed, but all nodes are virtual machines, so it might
> >> be a limitation of the VM. I have this output:
> >>
> >> # lscpu
> >> lscpu: failed to determine number of CPUs: /sys/devices/system/cpu/possible: No such file or directory
> >>
> >> I'm now trying to determine the number of cpus with other commands and
> >> change the installer script accordingly, like here or other alternatives:
> >>
> >> http://stackoverflow.com/questions/6481005/how-to-obtain-the-number-of-cpus-cores-in-linux-from-the-command-line
> >>
> >> On Thu, Aug 27, 2015 at 10:59 AM Liu Ming <ovis_poly@sina.com> wrote:
> >> > hi, Radu,
> >> > As Amanda suggested, the sqconfig file contains some issue; it is a
> >> > config file auto-generated by the installer. The exact error is in the
> >> > 'cores' column, as I see from the error message in your message.
> >> > 'cores' is generated by parsing the output of the command 'lscpu'.
> >> > Would you please paste the output of lscpu on your system as well?
> >> > Maybe you don't have lscpu installed; then you can install it:
> >> > yum install util-linux-ng
> >> > A 'basic server' installation of CentOS should have that package. So if
> >> > that is not installed, I am curious what option you chose when you
> >> > installed CentOS. As mentioned in the Trafodion wiki, the recommended
> >> > OS is CentOS 6.x; I am not clear if Trafodion can work well on CentOS 7.
> >> > If it is already installed, let us check the output of lscpu.
> >> > thanks, Ming
> >> > ----- Original Message -----
> >> > From: Amanda Moran <amanda.moran@esgyn.com>
> >> > To: dev <dev@trafodion.incubator.apache.org>
> >> > Subject: Re: installer: ERROR (line 1): Unknown statement 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
> >> > Date: 2015-08-27 06:46
> >> >
> >> > Your sqconfig file has not been generated properly. Can you copy the
> >> > file <path>installer/sqconfig_node1 into an email? This file is
> >> > generated by the script <path>/installer/traf_sqconfig.
> >> > On Wed, Aug 26, 2015 at 3:41 PM, Radu Marias <radumarias@gmail.com> wrote:
> >> > > Hi,
> >> > >
> >> > > I have a cluster of 5 nodes, each as a virtual machine.
> >> > > This is on them:
> >> > > Centos 7
> >> > > Ambari 2.1
> >> > > HDP 2.2
> >> > > perl.x86_64 4:5.16.3-285.el7
> >> > > jdk1.7.0_67, installed by ambari
> >> > >
> >> > > Running the installer, I get these errors after HDP is restarted
> >> > > and the trafodion folders are created on the nodes:
> >> > >
> >> > > ***INFO: Trafodion Mods ran successfully.
> >> > >
> >> > > ******************************
> >> > >  TRAFODION START
> >> > > ******************************
> >> > >
> >> > > /usr/lib/trafodion/installer/..
> >> > > ***INFO: Log file location
> >> > > /var/log/trafodion/trafodion_install_2015-08-26-19-44-39.log
> >> > > ***INFO: traf_start
> >> > > ******************************************
> >> > > ******************************************
> >> > > ******************************************
> >> > > ******************************************
> >> > > /home/trafodion/trafodion-20150821_0830
> >> > > ***INFO: untarring build file /usr/lib/trafodion/trafodion-20150821_0830/trafodion_server-1.2.0.tgz to /home/trafodion/trafodion-20150821_0830
> >> > > ***INFO: modifying .bashrc to set Trafodion environment variables
> >> > > ***INFO: copying .bashrc file to all nodes
> >> > > ***INFO: copying sqconfig file (/home/trafodion/sqconfig) to /home/trafodion/trafodion-20150821_0830/sql/scripts/sqconfig
> >> > > ***INFO: Creating /home/trafodion/trafodion-20150821_0830 directory on all nodes
> >> > > ***INFO: starting sqgen
> >> > > node1,node2,node3,node4,node5
> >> > >
> >> > > Creating directories on cluster nodes
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/etc
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/logs
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/tmp
> >> > > /usr/bin/pdsh -R exec -w node1,node2,node3,node4,node5 -x node5 ssh -q -n %h mkdir -p /home/trafodion/trafodion-20150821_0830/sql/scripts
> >> > >
> >> > > The SQ environment variable file
> >> > > /home/trafodion/trafodion-20150821_0830/etc/ms.env exists.
> >> > > The file will not be re-generated.
> >> > >
> >> > >
> >> > > ERROR (line 1):  Unknown statement 0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage .
> >> > > Note: Using cluster.conf format type 2.
> >> > > For "node-id=0;node-name=node1;cores=;processors=1;roles=connection,aggregation,storage":
> >> > >    Error: Enclosure name not specified
> >> > >    Error: Enclosure node list not specified
> >> > >    Error: not a valid node configuration statement.
> >> > > Exiting without generating cluster.conf due to errors.
> >> > > ***ERROR: sqgen failed with RC=1. Check install log file for details.
> >> > > ***ERROR: Error while running traf_start.
> >> > > ***ERROR: Setup not complete, review logs.
> >> > > ***ERROR: Exiting....
> >> > >
> >> > > Also in order to get this far I had to also install some perl modules:
> >> > >
> >> > > yum -y install perl-version.x86_64
> >> > > yum -y install perl-DBI.x86_64
> >> > > yum -y install DBD::SQLite
> >> > >
> >> > > And add ctime.pl to /usr/share/perl5/. File from
> >> > > https://github.com/dwimperl/perl-5.12.3.0/blob/master/perl/lib/ctime.pl
> >> > >
> >> > > --
> >> > > And in the end, it's not the years in your life that count.
> >> > > It's the life in your years.
> >> > >
> >> > --
> >> > Thanks,
> >> > Amanda Moran
> >> >
> >>
> >
> >
> >
> > --
> > And in the end, it's not the years in your life that count. It's the
> > life in your years.
> >
>
>
>
> --
> And in the end, it's not the years in your life that count. It's the
> life in your years.
>
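
The /proc/cpuinfo approach mentioned in the thread can be sketched roughly as
follows, in Python. This is not the patch that was attached to the original
message (its contents are not shown here); it is only an illustration that
assumes the usual x86 Linux layout of /proc/cpuinfo, with one `processor`
entry per logical CPU and a `physical id` entry per package. The function
name and the sample text are hypothetical.

```python
def count_cpus(cpuinfo_text):
    """Count logical processors and distinct physical packages.

    cpuinfo_text: contents of /proc/cpuinfo (x86 layout assumed).
    Returns a (logical_processors, packages) tuple.
    """
    logical = 0
    packages = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor entries
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            packages.add(value.strip())
    # Assumption: if "physical id" is absent (as on some virtual machines),
    # treat all logical processors as belonging to a single package.
    if packages:
        sockets = len(packages)
    else:
        sockets = 1 if logical else 0
    return logical, sockets


# Hypothetical two-processor sample in the /proc/cpuinfo style.
sample = """\
processor\t: 0
physical id\t: 0
cpu cores\t: 2

processor\t: 1
physical id\t: 0
cpu cores\t: 2
"""

print(count_cpus(sample))  # (2, 1)
```

On a Linux host the same function could be fed the real file, for example
`count_cpus(open("/proc/cpuinfo").read())`. Because it only reads a text
file, it does not depend on /sys/devices/system/cpu/possible, which is what
lscpu failed on in the virtual machines described above.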
