metron-dev mailing list archives

From mattf-horton <...@git.apache.org>
Subject [GitHub] incubator-metron pull request #425: METRON 609 Enhance Mpack to handle singl...
Date Thu, 26 Jan 2017 08:40:15 GMT
GitHub user mattf-horton opened a pull request:

    https://github.com/apache/incubator-metron/pull/425

    METRON 609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch

    This PR is not ready for prime time, but is provided for ease of access to work-in-progress for:
    - METRON-609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch, and
    - METRON-634 Mpack bug fixes and improvements (not related to singlenode install).
    
    These are presented as two separate commits, so you can look at them separately if you wish.
    
    These are the included enhancements from METRON-609:
    - Enable 1-, 2-, and 3-node clusters to have a working Elasticsearch install via the Mpack.
    	- Change constraints from 1+ Masters and 3+ Slaves, to 1+ Masters and 0+ Slaves.
    	- Allow non-dedicated master/datanodes via boolean "masters_also_are_datanodes".
    	- Allow use of alternative single-node template via "single_node_elasticsearch" boolean.
    	- Only the 1- and 4-node clusters have been tested, and that was last month.
    - Improve various mouse-over Description fields in the GUI.
    - I included the attempted validation check on (storm) num_slots = slots_per_supervisor * num_supervisors. This doesn't currently work due to a pre-existing bug in other parts of the validation check, so I haven't been able to test it.
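
For reference, the relationship that the validation check is meant to enforce can be sketched as a standalone function. This is only an illustration and not the Mpack's validation code: the function name `check_storm_slots` and its return convention are hypothetical, and it deliberately avoids Ambari's own validation hooks.

```python
def check_storm_slots(num_slots, slots_per_supervisor, num_supervisors):
    """Return None if num_slots is consistent, else an error message.

    Hypothetical helper: the argument names mirror the description
    above, not actual Ambari or Metron configuration keys.
    """
    expected = slots_per_supervisor * num_supervisors
    if num_slots == expected:
        return None
    return ("num_slots is {0}, but slots_per_supervisor * num_supervisors "
            "= {1} * {2} = {3}".format(num_slots, slots_per_supervisor,
                                       num_supervisors, expected))
```

Returning a message rather than raising keeps the sketch easy to plug into whatever reporting mechanism the real validator uses.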
    
    These are the included enhancements and bug fixes from METRON-634:    
    NOT AFFECTING THE AMBARI DATABASE:
    - ES pid_dir specification and usage:
    	- Currently pid_dir is multiply specified in elastic-env.xml and params.py. The config parameter should not be overridden in params.py.
    	- PID_DIR is missing from /etc/sysconfig/elasticsearch. It needs to be added to the template in elastic-sysconfig, as it must be provided to ES at launch-time (else the default directory will be used).
    	- pid_file is specified in params.py, but is not used anywhere. (The ES internal launcher synthesizes it from PID_DIR, and this is appropriate.)
    - JAVA_HOME needs to be provided in /etc/sysconfig/elasticsearch (templated in elastic-sysconfig.xml). Its absence causes CentOS 7 systemctl to fail the ES launch, unless /bin/java is defined (which is not guaranteed).
    - Also in the /etc/sysconfig/elasticsearch template in elastic-sysconfig.xml, the value of ES_JAVA_OPTS incorrectly spans 3 lines. The lines must be terminated with backslashes to effectively become a single line. The current inclusion of newlines in the long string value is acceptable (although unusual) in a shell script, but not in a systemd EnvironmentFile. /etc/sysconfig/elasticsearch must function as both.
    - Also in ES_JAVA_OPTS, the two instances of log_dir need to be followed by a slash '/'.
    - In elastic.py, when directories are being pre-created and permissions set, the directory $CONF_DIR/scripts should also be pre-created. I intermittently hit permissions issues with this directory being created later by root, and not properly assigned to elastic_user.
    - In several places in elastic.py, "params.elastic_user" is incorrectly used when "params.user_group" should be used.
    - An undefined "format()" method is used unnecessarily in elastic.py, in File(format("/etc/sysconfig/elasticsearch")...
    - The undefined "format()" method is similarly used several times, unnecessarily, in elastic_master.py.
    - The comments and descriptions in elastic-site.xml have multiple suggested improvements.
    - Provide Quick Links in Ambari service page for Elasticsearch to the self-report pages for ES health and ES node list. (very useful for debugging)
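
The ES_JAVA_OPTS items above can be illustrated with a small sketch. It relies on the fact that both a shell and a systemd EnvironmentFile treat a trailing backslash as a line continuation. The helper names `render_multiline_value` and `with_trailing_slash`, the log directory, and the option values are all hypothetical; the real option list lives in the template in elastic-sysconfig.xml.

```python
def render_multiline_value(name, parts):
    """Render NAME="..." over several lines joined by trailing backslashes.

    Every physical line except the last ends with a backslash, so a shell
    and a systemd EnvironmentFile both read one logical line.
    """
    body = " \\\n".join(parts)
    return '{0}="{1}"'.format(name, body)


def with_trailing_slash(path):
    """Ensure a directory path ends with '/', so a file name can be appended."""
    return path if path.endswith("/") else path + "/"


# Illustrative values only (assumptions, not copied from the template).
log_dir = with_trailing_slash("/var/log/elasticsearch")
line = render_multiline_value("ES_JAVA_OPTS", [
    "-verbose:gc",
    "-Xloggc:{0}elasticsearch_gc.log".format(log_dir),
    "-XX:ErrorFile={0}elasticsearch_err.log".format(log_dir),
])
print(line)
```

Normalizing the slash in one place avoids repeating the fix at each use of log_dir.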
    
    CHANGES THAT DO AFFECT THE AMBARI DATABASE:
    - pid_dir SHOULD be specified in elastic-sysconfig.xml, rather than elastic-env.xml, as it is a parameter that must be provided to ES at launch-time, but is not something there's any reason for the admin to change in usual circumstances.
    - conf_dir SHOULD be specified in elastic-env.xml or elastic-site.xml, not in elastic-sysconfig.xml. While it too is a parameter that must be provided to ES at launch-time, the installing admin typically decides where to put the config files.
    - The Ambari configuration parameter names in elastic-site.xml should be improved in several instances to make the semantics more obvious to the human reader (who may not be very familiar with Elasticsearch configuration). Mouse-over documentation will continue to provide the ES config parameter equivalents. In particular, I suggest:
    	- cluster_name -> es_cluster_name  (to distinguish ES cluster from Stack cluster)
    	- zen_discovery_ping_unicast_hosts -> es_cluster_hosts
    	- network_host -> network_bindings  (these are in fact interface names, not host names)
    - There are at least two places in elasticsearch.master.yaml.j2 (zen_discovery_ping_unicast_hosts and network_host) where needed square brackets are either missing or included in the configuration string. To be consistent with other usages, and less prone to human error, the square brackets should not be in the configuration string but rather should be provided in the template text.
    - In METRON/0.3.0/configuration/metron-env.xml and METRON/0.3.0/package/scripts/params/params_linux.py, the value "metron_apps_indexed_hdfs_dir" does not need to be settable by admin; it is appropriate to require it to be subordinate to "metron_apps_hdfs_dir". Thus it can be removed from metron-env.xml and set to "{metron_apps_hdfs_dir}/indexing/indexed" in params_linux.py. This also eliminates a really unacceptable use of "double format".
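
The last two items can be sketched in plain Python. This is an illustration under stated assumptions, not the code in this PR: the function names `render_yaml_list` and `indexed_hdfs_dir` are hypothetical, the real template is Jinja2 rather than Python string formatting, and the stripping of a trailing slash is an added convenience not described above.

```python
def render_yaml_list(key, csv_value):
    """Render 'key: [ a, b ]' from an unbracketed comma-separated string.

    The square brackets come from the template text, so the admin-supplied
    configuration string never needs to contain them.
    """
    items = [item.strip() for item in csv_value.split(",") if item.strip()]
    return "{0}: [ {1} ]".format(key, ", ".join(items))


def indexed_hdfs_dir(metron_apps_hdfs_dir):
    """Derive the indexed-data directory from its parent directory.

    Stripping a trailing '/' is an assumption made here to avoid '//'.
    """
    return "{0}/indexing/indexed".format(metron_apps_hdfs_dir.rstrip("/"))


# Hypothetical host names and HDFS path, for illustration only.
print(render_yaml_list("discovery.zen.ping.unicast.hosts", "node1, node2"))
print(indexed_hdfs_dir("/apps/metron"))
```

Deriving the subordinate path in one place means a single format step is enough, since the admin-settable value no longer embeds another placeholder.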
    
    NOTE that these changes, because they affect the database, should properly be accompanied by a database update script and a version increment in the Mpack version number. This is not currently implemented.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattf-horton/incubator-metron METRON-609

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #425
    
----
commit 0fd12a5bab2745e7c496657ef92b792b60faf2bf
Author: mattf-horton <mfoley@hortonworks.com>
Date:   2017-01-25T22:39:23Z

    METRON-609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch. Work in Progress, at request of David Lyle.

commit 1af5376d59fe4c1812bda519e9b960dc74fdb0d6
Author: mattf-horton <mfoley@hortonworks.com>
Date:   2017-01-26T07:41:04Z

    METRON-634 Mpack bug fixes and improvements (not related to singlenode install). Partial: all improvements from METRON-634 already proved out in METRON-608.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
