metron-dev mailing list archives

From James Sirota <jsir...@apache.org>
Subject Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action
Date Mon, 16 Jan 2017 20:40:58 GMT
In my view the live configs should live in Zookeeper.  It's basically what it's designed for.
 However, we also have a need for CM of these configs in case you want to roll back or push
a different config set into Zookeeper.  That's what I would use Ambari for...have the ability
to take a config out of CM and push it into Zookeeper...or snapshot a config out of Zookeeper
and push it into CM.  The obvious pre-requisite to having this capability is to not rely on
local storage or HDFS for any config.  So in my mind this is a 2-step transition.  Step 1
- transition all current configs into Zookeeper.  Step 2 - integrate config management with
Ambari.
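
To make the snapshot/push idea concrete, here is a minimal sketch of the round trip between the live store and CM. It is only an illustration under assumptions: both classes are hypothetical in-memory stand-ins (no real Zookeeper or Ambari calls are made), and the config key and values are invented for the example.

```python
import copy
import json


class VersionedConfigStore:
    """Hypothetical stand-in for the CM side (the role proposed for Ambari)."""

    def __init__(self):
        self._versions = []  # serialized snapshots, oldest first

    def commit(self, configs):
        # Serialize so later edits to the live configs cannot alter history.
        self._versions.append(json.dumps(configs, sort_keys=True))
        return len(self._versions)  # 1-based version number

    def checkout(self, version):
        return json.loads(self._versions[version - 1])


class LiveConfigStore:
    """Hypothetical stand-in for the live configs held in Zookeeper."""

    def __init__(self):
        self.nodes = {}

    def snapshot(self):
        return copy.deepcopy(self.nodes)

    def load(self, configs):
        self.nodes = copy.deepcopy(configs)


live = LiveConfigStore()
cm = VersionedConfigStore()

# Illustrative config only; the key and values are assumptions.
live.load({"global": {"es.ip": "node1"}})
v1 = cm.commit(live.snapshot())          # snapshot out of the live store into CM

live.nodes["global"]["es.ip"] = "node2"  # a live edit
v2 = cm.commit(live.snapshot())

live.load(cm.checkout(v1))               # roll back: push version 1 from CM to live
```

The point of the sketch is that the live store never needs local disk or HDFS: every rollback is a checkout from CM followed by a push.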

I think passing usernames/passwords to Stellar functions is not a feasible solution at this
point.

Thanks,
James 

15.01.2017, 18:28, "JJ Meyer" <jjmeyer0@gmail.com>:
> Quite late to the party, but with all this great back and forth I felt like
> I had to join in :)
>
> I believe SolrCloud uses ZooKeeper to manage most of its configuration
> files. When searching, I was only able to find this (
> https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files).
> I wasn't able to find any initial discussion on their architecture. If we
> can find more we still may be able to learn from them.
>
> Also, on the idea of passing a username/password to a Stellar function or
> to some shell script. We may want to do it a bit differently or at least
> give the option to do it differently. I know supplying the
> username/password directly is easy when testing and playing around, but it
> probably isn't going to be allowed for a user in production. Maybe we can
> also support a credentials file and eventually support encrypting sensitive
> values in configs?
>
> Thanks,
> JJ
>
> On Sun, Jan 15, 2017 at 1:26 PM, Michael Miklavcic <
> michael.miklavcic@gmail.com> wrote:
>
>>  Ha, I was betrayed by copy/paste in Chrome.
>>
>>  On Thu, Jan 12, 2017 at 7:24 PM, Matt Foley <mattf@apache.org> wrote:
>>
>>>  Mike, could you try again on the image, please, making sure it is a
>>>  simple format (gif, png, or jpeg)? It got munched, at least in my viewer.
>>>  Thanks.
>>>
>>>  Casey, responding to some of the questions you raised:
>>>
>>>  I’m going to make a rather strong statement: We already have a service
>>>  “to intermediate and handle config update/retrieval”.
>>>  Furthermore, it:
>>>  - Correctly handles the problems of distributed services running on
>>>  multi-node clusters. (That’s a HARD problem, people, and we shouldn’t try
>>>  to reinvent the wheel.)
>>>  - Correctly handles Kerberos security. (That’s kinda hard too, or at
>>>  least a lot of work.)
>>>  - It does automatic versioning of configurations, and allows viewing,
>>>  comparing, and reverting historical configs
>>>  - It has a capable REST API for all those things.
>>>  It doesn’t natively integrate Zookeeper storage of configs, but there is
>>>  a natural place to specify copy to/from Zookeeper for the files desired.
>>>
>>>  It is Ambari. And we should commit to it, rather than try to re-create
>>>  such features.
>>>  Because it has a good REST API, it is perfectly feasible to implement
>>>  Stellar functions that call it.
>>>  GUI configuration tools can also use the Ambari APIs, or better yet be
>>>  integrated in an “Ambari View”. (Eg, see the “Yarn Capacity Scheduler
>>>  Configuration Tool” example in the Ambari documentation, under “Using
>>>  Ambari Views”.)
>>>
>>>  Arguments are: Parsimony, Sufficiency, Not reinventing the wheel, and Not
>>>  spending weeks and weeks of developer time over the next year reinventing
>>>  the wheel while getting details wrong multiple times…
>>>
>>>  Okay, off soapbox.
>>>
>>>  Casey asked what the config update behavior of Ambari is, and how it will
>>>  interact with changes made from outside Ambari.
>>>  The following is from my experience working with the Ambari Mpack for
>>>  Metron. I am not otherwise an Ambari expert, so tomorrow I’ll get it
>>>  reviewed by an Ambari development engineer.
>>>
>>>  Ambari-server runs on one node, and Ambari-agent runs on every node.
>>>  Ambari-server has a private set of py, xml, and template files, which
>>>  together are used both to generate the Ambari configuration GUI, with
>>>  defaults, and to generate configuration files (of any needed filetype) for
>>>  the various Stack components.
>>>  Ambari-server also has a database where it stores the schema related to
>>>  these files, so even if you reach in and edit Ambari’s files, it will Error
>>>  out if the set of parameters or parameter names changes. The historical
>>>  information about configuration changes is also stored in the db.
>>>  For each component (and in the case of Metron, for each topology), there
>>>  is a python file which controls the logic for these actions, among others:
>>>  - Install
>>>  - Start / stop / restart / status
>>>  - Configure
>>>
>>>  It is actually up to this python code (which we wrote for the Metron
>>>  Mpack) what happens in each of these API calls. But the current code, and
>>>  I believe this is typical of Ambari-managed components, performs a
>>>  “Configure” action whenever you press the “Save” button after changing a
>>>  component config in Ambari, and also on each Install and Start or Restart.
>>>
>>>  The Configure action consists of approximately the following sequence
>>>  (see disclaimer above :-)
>>>  - Recreate the generated config files, using the template files and the
>>>  actual configuration most recently set in Ambari
>>>  o Note this is also under the control of python code that we wrote, and
>>>  this is the appropriate place to push to ZK if desired.
>>>  - Propagate those config files to each Ambari-agent, with a command to
>>>  set them locally
>>>  - The ambari-agents on each node receive the files and write them to the
>>>  specified locations on local storage
>>>
>>>  Ambari-server then whines that the updated services should be restarted,
>>>  but does not initiate that action itself (unless of course the initiating
>>>  action was a Start command from the administrator).
>>>
>>>  Make sense? It’s all quite straightforward in concept, there’s just an
>>>  awful lot of stuff wrapped around that to make it all go smoothly and
>>>  handle the problems when it doesn’t.
>>>
>>>  There’s additional complexity in that the Ambari-agent also caches (on
>>>  each node) both the template files and COMPILED forms of the python files
>>>  (.pyc) involved in transforming them. The pyc files incorporate some
>>>  amount of additional info regarding parameter values, but I’m not sure of
>>>  the form. I don’t think that changes the above in any practical way unless
>>>  you’re trying to cheat Ambari by reaching in and editing its files
>>>  directly. In that case, you also need to whack the pyc files (on each
>>>  node) to force the data to be reloaded from Ambari-server. Best solution
>>>  is don’t cheat.
>>>
>>>  Also, there may be circumstances under which the Ambari-agent will detect
>>>  changes and re-write the latest version it knows of the config files, even
>>>  without a Save or Start action at the Ambari-server. I’m not sure of this
>>>  and need to check with Ambari developers. It may no longer happen, although
>>>  I’m pretty sure change detection/reversion was a feature of early versions
>>>  of Ambari.
>>>
>>>  Hope this helps,
>>>  --Matt
>>>
>>>  ================================================
>>>  From: Michael Miklavcic <michael.miklavcic@gmail.com>
>>>  Reply-To: "dev@metron.incubator.apache.org" <
>>>  dev@metron.incubator.apache.org>
>>>  Date: Thursday, January 12, 2017 at 3:59 PM
>>>  To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>>>  Subject: Re: [DISCUSS] Ambari Metron Configuration Management
>>>  consequences and call to action
>>>
>>>  Hi Casey,
>>>
>>>  Thanks for starting this thread. I believe you are correct in your
>>>  assessment of the 4 options for updating configs in Metron. When using more
>>>  than one of these options we can get into a split-brain scenario. A basic
>>>  example is updating the global config on disk and using the
>>>  zk_load_configs.sh. Later, if a user decides to restart Ambari, the cached
>>>  version stored by Ambari (it's in the MySQL or other database backing
>>>  Ambari) will be written out to disk in the defined config directory, and
>>>  subsequently loaded using the zk_load_configs.sh under the hood. Any global
>>>  configuration modified outside of Ambari will be lost at this point. This
>>>  is obviously undesirable, but I also like the purpose and utility exposed
>>>  by the multiple config management interfaces we currently have available. I
>>>  also agree that a service would be best.
>>>
>>>  For reference, here's my understanding of the current configuration
>>>  loading mechanisms and their deps.
>>>
>>>  <image>
>>>
>>>  Mike
>>>
>>>  On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <cestella@gmail.com> wrote:
>>>
>>>  In the course of discussion on the PR for METRON-652
>>>  <https://github.com/apache/incubator-metron/pull/415>, something that I
>>>  should definitely have understood better came to light, and I thought it
>>>  was worth bringing to the attention of the community for clarification
>>>  and discussion: just how we manage configs.
>>>
>>>  Currently (assuming the management UI that Ryan Merriman submitted)
>>>  configs are managed/adjusted via a couple of different mechanisms.
>>>
>>>     - zk_load_configs.sh: pushed and pulled from disk to zookeeper
>>>     - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT
>>>  functions
>>>     - Ambari: initialized via the zk_load_configs.sh script, and then some
>>>     of them are managed directly (global config) and some indirectly
>>>     (sensor-specific configs).
>>>        - NOTE: Upon service restart, it may or may not overwrite changes on
>>>        disk or on zookeeper. *Can someone more knowledgeable than me about
>>>        this describe precisely the semantics that we can expect on service
>>>        restart for Ambari? What gets overwritten on disk and what gets
>>>        updated in ambari?*
>>>     - The Management UI: manages some of the configs. *RYAN: Which configs
>>>     do we support here and which don't we support here?*
>>>
>>>  As you can see, we have a mishmash of mechanisms to update and manage the
>>>  configuration for Metron in zookeeper. In the beginning the approach was
>>>  just to edit configs on disk and push/pull them via zk_load_configs.sh.
>>>  Configs could be historically managed using source control, etc. As we got
>>>  more and more components managing the configs, we haven't taken care that
>>>  they all work with each other in an expected way (I believe these are
>>>  true; correct me if I'm wrong):
>>>
>>>     - If configs are modified in the management UI or the Stellar REPL and
>>>     someone forgets to pull the configs from zookeeper to disk before doing
>>>     a push via zk_load_configs.sh, they will clobber the configs in
>>>     zookeeper with old configs.
>>>     - If the global config is changed on disk and the ambari service
>>>     restarts, it'll get reset with the original global config.
>>>     - *Ryan, in the management UI, if someone changes the zookeeper configs
>>>     from outside, are those configs reflected immediately in the UI?*
>>>
>>>  It seems to me that we have a couple of options here:
>>>
>>>     - A service to intermediate and handle config update/retrieval and
>>>     tracking historical changes so these different mechanisms can use a
>>>     common component for config management/tracking and refactor the
>>>     existing mechanisms to use that service
>>>     - Standardize on exactly one component to manage the configs and
>>>     regress the others (that's a verb, right? nicer than delete.)
>>>
>>>  I happen to like the service approach, myself, but I wanted to put it up
>>>  for discussion and hopefully someone will volunteer to design such a
>>>  thing.
>>>
>>>  To frame the debate, I want us to keep in mind a couple of things that may
>>>  or may not be relevant to the discussion:
>>>
>>>     - We will eventually be moving to support kerberos so there should at
>>>     least be a path to use kerberos for any solution IMO
>>>     - There is value in each of the different mechanisms in place now. If
>>>     there weren't, then they wouldn't have been created. Before we try to
>>>     make this a "there can be only one" argument, I'd like to hear very
>>>     good arguments.
>>>
>>>  Finally, I'd appreciate if some people might answer the questions I have
>>>  in bold there. Hopefully this discussion, if nothing else happens, will
>>>  result in fodder for proper documentation of the ins and outs of each of
>>>  the components bulleted above.
>>>
>>>  Best,
>>>
>>>  Casey
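
The clobbering scenario described in the quoted thread (a push of a stale on-disk copy overwriting newer configs) is the kind of problem a version check on write can catch. Zookeeper znodes do carry a version number that a conditional write can test, but the sketch below does not call Zookeeper: `VersionedNode` and `StaleWriteError` are hypothetical in-memory stand-ins, and the `batchSize` values are invented for the example.

```python
class StaleWriteError(Exception):
    """Raised when a push is based on an outdated copy of the config."""


class VersionedNode:
    """Hypothetical in-memory stand-in for one config node with a version counter."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def read(self):
        return self.data, self.version

    def write(self, data, expected_version):
        # Refuse the write if the node changed since the caller last read it.
        if expected_version != self.version:
            raise StaleWriteError(
                f"expected version {expected_version}, node is at {self.version}"
            )
        self.data = data
        self.version += 1


node = VersionedNode({"batchSize": 5})

# The on-disk copy is pulled at version 0.
disk_copy, disk_version = node.read()

# Meanwhile another tool (for example the REPL or the UI) updates the node.
current, version = node.read()
node.write({"batchSize": 10}, expected_version=version)

# Pushing the old on-disk copy is now rejected instead of clobbering.
try:
    node.write(disk_copy, expected_version=disk_version)
    clobbered = True
except StaleWriteError:
    clobbered = False
```

With a check like this, whichever tool pushes second has to pull and merge first, regardless of which of the mechanisms above it is.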

------------------- 
Thank you,

James Sirota
PPMC- Apache Metron (Incubating)
jsirota AT apache DOT org
