metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action
Date Fri, 13 Jan 2017 14:14:57 GMT
So, the reason to have the push operations push to ambari and then have
ambari sync to zookeeper (btw: is this possible, do we have a hook like
that in ambari?) is to make sure that users can specify a comment about
what changed, correct?  If we pushed to zookeeper and had ambari listen
(not sure it can do that either, btw) and update itself, we wouldn't be
able to specify reasons.
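
To make the "comment about what changed" point concrete, here is a minimal sketch of the request body for a config push to Ambari that carries a note. It assumes Ambari's cluster-level desired_config update (a PUT to /api/v1/clusters/<cluster>) accepts a service_config_version_note field. The helper name, the "metron-env" config type and the property shown are illustrative placeholders, and nothing is actually sent to a server.

```python
import json
import time


def build_config_update(config_type, properties, note):
    """Build a JSON-serializable body for a desired_config update.

    Assumption: the note travels in 'service_config_version_note'.
    The helper itself is hypothetical, not part of Ambari or Metron.
    """
    # Each new config version needs a tag that is unique for its type;
    # a millisecond timestamp is one simple way to get that.
    tag = "version{}".format(int(time.time() * 1000))
    return {
        "Clusters": {
            "desired_config": {
                "type": config_type,
                "tag": tag,
                "properties": properties,
                "service_config_version_note": note,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder config type and property, for illustration only.
    body = build_config_update(
        "metron-env",
        {"example_property": "example_value"},
        "Changed example_property for testing",
    )
    print(json.dumps(body, indent=2))
```

A push that goes straight to zookeeper has no equivalent slot for the note, which is the trade-off described above.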

Casey

On Fri, Jan 13, 2017 at 9:09 AM, David Lyle <dlyle65535@gmail.com> wrote:

> The only tooling I'm aware of that Ambari isn't already using is the
> Stellar stuff, is there more?
>
> Regardless, I'd always push from Ambari to zookeeper and let other tooling
> talk to Ambari (Casey's first bullet). The only wrinkle is we have to
> decide if we want to support manual installation. Fwiw, I do. If we did,
> we'd need to do a bit of mode selection to support both. But the happy path
> would be to do stuff (human or machine) via Ambari.
>
> -D...
>
>
> On Fri, Jan 13, 2017 at 9:01 AM, Casey Stella <cestella@gmail.com> wrote:
>
> > Just piling on in support for Ambari.  I really, really don't like
> > reinventing wheels, especially hard ones.  I guess my questions now are
> > mainly around technical feasibility.  Seems to me that we can either:
> >
> >    - retrofit the tooling that currently manages configs to use the
> >    Ambari APIs as well as pushing to zookeeper
> >    - have a service listening to zookeeper and pushing changes to
> >    ambari to keep it in sync
> >    - Something that I may have missed
> >
> > Each of those has pros and cons.  Thoughts?
> >
> > Casey
> >
> > On Fri, Jan 13, 2017 at 8:53 AM, David Lyle <dlyle65535@gmail.com> wrote:
> >
> > > I'm in complete agreement with all the points Matt made. I think the
> > > way forward should be to expose ALL user-modifiable configs via Ambari
> > > and let Ambari actively manage them. We should keep the command line
> > > tools as the backend and Ambari should continue to leverage them. This
> > > will allow manual installation/management if desired and will ensure
> > > the command line scripts are kept up to date.
> > >
> > > Fully leveraging Ambari has many beneficial effects. My top four:
> > >    - Provides proper revision control for the configurations
> > >    - Scales easily into things like rolling|quick upgrades and Kerberos
> > >      support
> > >    - Provides other applications a RESTful endpoint to change
> > >      configurations
> > >    - We get a force multiplier from the Ambari devs
> > >
> > > The working description Matt provided is completely consistent with my
> > > understanding of how it works (derived from the Ambari docs, authoring
> > > pieces of the mpack and interacting with some Ambari devs). Restarting
> > > the Ambari agent is the only circumstance I'm aware of, outside of
> > > save/start|restart, that would initiate a re-write of the configs and
> > > cache; there could be others.
> > >
> > > -D...
> > >
> > > On Thu, Jan 12, 2017 at 9:24 PM, Matt Foley <mattf@apache.org> wrote:
> > >
> > > > Mike, could you try again on the image, please, making sure it is a
> > > > simple format (gif, png, or jpeg)?  It got munched, at least in my
> > > > viewer.  Thanks.
> > > >
> > > > Casey, responding to some of the questions you raised:
> > > >
> > > > I’m going to make a rather strong statement:  We already have a
> > > > service “to intermediate and handle config update/retrieval”.
> > > > Furthermore, it:
> > > > - Correctly handles the problems of distributed services running on
> > > >   multi-node clusters.  (That’s a HARD problem, people, and we
> > > >   shouldn’t try to reinvent the wheel.)
> > > > - Correctly handles Kerberos security. (That’s kinda hard too, or at
> > > >   least a lot of work.)
> > > > - Does automatic versioning of configurations, and allows viewing,
> > > >   comparing, and reverting historical configs.
> > > > - Has a capable REST API for all those things.
> > > > It doesn’t natively integrate Zookeeper storage of configs, but there
> > > > is a natural place to specify copy to/from Zookeeper for the files
> > > > desired.
> > > >
> > > > It is Ambari.  And we should commit to it, rather than try to
> > > > re-create such features.
> > > > Because it has a good REST API, it is perfectly feasible to implement
> > > > Stellar functions that call it.
> > > > GUI configuration tools can also use the Ambari APIs, or better yet
> > > > be integrated in an “Ambari View”. (Eg, see the “Yarn Capacity
> > > > Scheduler Configuration Tool” example in the Ambari documentation,
> > > > under “Using Ambari Views”.)
> > > >
> > > > Arguments are: Parsimony, Sufficiency, Not reinventing the wheel, and
> > > > Not spending weeks and weeks of developer time over the next year
> > > > reinventing the wheel while getting details wrong multiple times…
> > > >
> > > > Okay, off soapbox.
> > > >
> > > > Casey asked what the config update behavior of Ambari is, and how it
> > > > will interact with changes made from outside Ambari.
> > > > The following is from my experience working with the Ambari Mpack for
> > > > Metron.  I am not otherwise an Ambari expert, so tomorrow I’ll get it
> > > > reviewed by an Ambari development engineer.
> > > >
> > > > Ambari-server runs on one node, and Ambari-agent runs on every node.
> > > > Ambari-server has a private set of py, xml, and template files, which
> > > > together are used both to generate the Ambari configuration GUI, with
> > > > defaults, and to generate configuration files (of any needed
> > > > filetype) for the various Stack components.
> > > > Ambari-server also has a database where it stores the schema related
> > > > to these files, so even if you reach in and edit Ambari’s files, it
> > > > will error out if the set of parameters or parameter names changes.
> > > > The historical information about configuration changes is also stored
> > > > in the db.
> > > > For each component (and in the case of Metron, for each topology),
> > > > there is a python file which controls the logic for these actions,
> > > > among others:
> > > > - Install
> > > > - Start / stop / restart / status
> > > > - Configure
> > > >
> > > > It is actually up to this python code (which we wrote for the Metron
> > > > Mpack) what happens in each of these API calls.  But the current
> > > > code, and I believe this is typical of Ambari-managed components,
> > > > performs a “Configure” action whenever you press the “Save” button
> > > > after changing a component config in Ambari, and also on each Install
> > > > and Start or Restart.
> > > >
> > > > The Configure action consists of approximately the following sequence
> > > > (see disclaimer above :-)
> > > > - Recreate the generated config files, using the template files and
> > > >   the actual configuration most recently set in Ambari
> > > >   o Note this is also under the control of python code that we wrote,
> > > >     and this is the appropriate place to push to ZK if desired.
> > > > - Propagate those config files to each Ambari-agent, with a command
> > > >   to set them locally
> > > > - The ambari-agents on each node receive the files and write them to
> > > >   the specified locations on local storage
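
The sequence above can be sketched as a toy, in-memory model. This is not Ambari or mpack code: the function name, the use of string.Template for rendering, the dicts standing in for each agent's local storage, and the file path are all assumptions made for illustration.

```python
from string import Template


def configure(templates, settings, agents, push_to_zk=None):
    """Toy model of the Configure action described above.

    templates:  {path: template text using $name placeholders}
    settings:   {name: value} as most recently saved
    agents:     list of dicts, each standing in for one node's local disk
    push_to_zk: optional callable invoked with the rendered files
    """
    # Step 1: regenerate every config file from its template and the
    # latest saved settings.
    rendered = {
        path: Template(text).substitute(settings)
        for path, text in templates.items()
    }

    # Optional hook at the point where a push to ZK would fit.
    if push_to_zk is not None:
        push_to_zk(rendered)

    # Steps 2 and 3: hand the files to every agent, which "writes"
    # them to its own local storage.
    for local_disk in agents:
        local_disk.update(rendered)

    return rendered


if __name__ == "__main__":
    zk = {}                      # stand-in for zookeeper
    nodes = [{}, {}, {}]         # three pretend agents
    files = configure(
        {"/hypothetical/global.json": '{"es.ip": "$es_ip"}'},
        {"es_ip": "10.0.0.1"},
        nodes,
        push_to_zk=zk.update,
    )
    print(files)
```

Note that nothing in this model restarts a service; as described next, that step stays with the administrator.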
> > > >
> > > > Ambari-server then whines that the updated services should be
> > > > restarted, but does not initiate that action itself (unless of course
> > > > the initiating action was a Start command from the administrator).
> > > >
> > > > Make sense?  It’s all quite straightforward in concept, there’s just
> > > > an awful lot of stuff wrapped around that to make it all go smoothly
> > > > and handle the problems when it doesn’t.
> > > >
> > > > There’s additional complexity in that the Ambari-agent also caches
> > > > (on each node) both the template files and COMPILED forms of the
> > > > python files (.pyc) involved in transforming them.  The pyc files
> > > > incorporate some amount of additional info regarding parameter
> > > > values, but I’m not sure of the form.  I don’t think that changes the
> > > > above in any practical way unless you’re trying to cheat Ambari by
> > > > reaching in and editing its files directly.  In that case, you also
> > > > need to whack the pyc files (on each node) to force the data to be
> > > > reloaded from Ambari-server.  Best solution is don’t cheat.
> > > >
> > > > Also, there may be circumstances under which the Ambari-agent will
> > > > detect changes and re-write the latest version it knows of the config
> > > > files, even without a Save or Start action at the Ambari-server.  I’m
> > > > not sure of this and need to check with Ambari developers.  It may no
> > > > longer happen, although I’m pretty sure change detection/reversion
> > > > was a feature of early versions of Ambari.
> > > >
> > > > Hope this helps,
> > > > --Matt
> > > >
> > > > ================================================
> > > > From: Michael Miklavcic <michael.miklavcic@gmail.com>
> > > > Reply-To: "dev@metron.incubator.apache.org"
> > > > <dev@metron.incubator.apache.org>
> > > > Date: Thursday, January 12, 2017 at 3:59 PM
> > > > To: "dev@metron.incubator.apache.org"
> > > > <dev@metron.incubator.apache.org>
> > > > Subject: Re: [DISCUSS] Ambari Metron Configuration Management
> > > > consequences and call to action
> > > >
> > > > Hi Casey,
> > > >
> > > > Thanks for starting this thread. I believe you are correct in your
> > > > assessment of the 4 options for updating configs in Metron. When
> > > > using more than one of these options we can get into a split-brain
> > > > scenario. A basic example is updating the global config on disk and
> > > > pushing it with zk_load_configs.sh. Later, if a user decides to
> > > > restart Ambari, the cached version stored by Ambari (it's in the
> > > > MySQL or other database backing Ambari) will be written out to disk
> > > > in the defined config directory, and subsequently loaded using
> > > > zk_load_configs.sh under the hood. Any global configuration modified
> > > > outside of Ambari will be lost at this point. This is obviously
> > > > undesirable, but I also like the purpose and utility exposed by the
> > > > multiple config management interfaces we currently have available. I
> > > > also agree that a service would be best.
> > > >
> > > > For reference, here's my understanding of the current configuration
> > > > loading mechanisms and their deps.
> > > >
> > > > <image>
> > > >
> > > > Mike
> > > >
> > > >
> > > > On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <cestella@gmail.com> wrote:
> > > >
> > > > In the course of discussion on the PR for METRON-652
> > > > <https://github.com/apache/incubator-metron/pull/415> something that
> > > > I should definitely have understood better came to light, and I
> > > > thought it was worth bringing to the attention of the community to
> > > > clarify and discuss: just how do we manage configs?
> > > >
> > > > Currently (assuming the management UI that Ryan Merriman submitted)
> > > > configs are managed/adjusted via a couple of different mechanisms.
> > > >
> > > >    - zk_load_configs.sh: pushed and pulled from disk to zookeeper
> > > >    - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT
> > > >    functions
> > > >    - Ambari: initialized via the zk_load_configs script and then some
> > > >    of them are managed directly (global config) and some indirectly
> > > >    (sensor-specific configs).
> > > >       - NOTE: Upon service restart, it may or may not overwrite
> > > >       changes on disk or on zookeeper.  *Can someone more
> > > >       knowledgeable than me about this describe precisely the
> > > >       semantics that we can expect on service restart for Ambari?
> > > >       What gets overwritten on disk and what gets updated in ambari?*
> > > >    - The Management UI: manages some of the configs. *RYAN: Which
> > > >    configs do we support here and which don't we support here?*
> > > >
> > > > As you can see, we have a mishmash of mechanisms to update and manage
> > > > the configuration for Metron in zookeeper.  In the beginning the
> > > > approach was just to edit configs on disk and push/pull them via
> > > > zk_load_configs.  Configs could be historically managed using source
> > > > control, etc.  As we got more and more components managing the
> > > > configs, we haven't taken care that they all work with each other in
> > > > an expected way (I believe these are true; correct me if I'm wrong):
> > > >
> > > >    - If configs are modified in the management UI or the Stellar REPL
> > > >    and someone forgets to pull the configs from zookeeper to disk
> > > >    before they do a push via zk_load_configs, they will clobber the
> > > >    configs in zookeeper with old configs.
> > > >    - If the global config is changed on disk and the ambari service
> > > >    restarts, it'll get reset with the original global config.
> > > >    - *Ryan, in the management UI, if someone changes the zookeeper
> > > >    configs from outside, are those configs reflected immediately in
> > > >    the UI?*
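
One way to picture the first clobbering case, and a possible guard against it, is the sketch below. It is an in-memory model only: the dict stands in for zookeeper, and fingerprint, guarded_push and StaleConfigError are hypothetical names, not existing Metron tooling. The idea is that a push is refused when the stored config no longer matches what the pusher last pulled.

```python
import hashlib
import json


class StaleConfigError(Exception):
    """Raised when the stored config changed since the last pull."""


def fingerprint(config):
    # Stable hash of a JSON-serializable config (keys sorted so that
    # key order does not affect the result).
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def guarded_push(store, key, new_config, last_pulled_fingerprint):
    """Write new_config only if the store still holds what we last pulled."""
    current = store.get(key)
    if current is not None and fingerprint(current) != last_pulled_fingerprint:
        raise StaleConfigError(
            "config '{}' changed since last pull; pull again first".format(key)
        )
    store[key] = new_config


if __name__ == "__main__":
    store = {"global": {"es.ip": "10.0.0.1"}}   # pretend zookeeper
    pulled = fingerprint(store["global"])        # user pulls to disk

    # Meanwhile another tool (say, a REPL session) edits the config.
    store["global"] = {"es.ip": "10.0.0.2"}

    try:
        guarded_push(store, "global", {"es.ip": "10.0.0.9"}, pulled)
    except StaleConfigError as err:
        print("push refused:", err)
```

ZooKeeper's own versioned setData call offers the same kind of check natively, so a real implementation would more likely rely on that than on a hash.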
> > > >
> > > >
> > > > It seems to me that we have a couple of options here:
> > > >
> > > >    - A service to intermediate and handle config update/retrieval and
> > > >    tracking historical changes so these different mechanisms can use
> > > >    a common component for config management/tracking and refactor the
> > > >    existing mechanisms to use that service
> > > >    - Standardize on exactly one component to manage the configs and
> > > >    regress the others (that's a verb, right?   nicer than delete.)
> > > >
> > > > I happen to like the service approach, myself, but I wanted to put it
> > > > up for discussion and hopefully someone will volunteer to design such
> > > > a thing.
> > > >
> > > > To frame the debate, I want us to keep in mind a couple of things
> > > > that may or may not be relevant to the discussion:
> > > >
> > > >    - We will eventually be moving to support kerberos so there should
> > > >    at least be a path to use kerberos for any solution IMO
> > > >    - There is value in each of the different mechanisms in place now.
> > > >    If there weren't, then they wouldn't have been created.  Before we
> > > >    try to make this a "there can be only one" argument, I'd like to
> > > >    hear very good arguments.
> > > >
> > > > Finally, I'd appreciate if some people might answer the questions I
> > > > have in bold there.  Hopefully this discussion, if nothing else
> > > > happens, will result in fodder for proper documentation of the ins
> > > > and outs of each of the components bulleted above.
> > > >
> > > > Best,
> > > >
> > > > Casey
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>
