metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action
Date Fri, 13 Jan 2017 15:28:34 GMT
No, it was good to bring up, Nick.  I might have it wrong re: Ambari.

Casey

On Fri, Jan 13, 2017 at 10:27 AM, Nick Allen <nick@nickallen.org> wrote:

> That makes sense.  I wasn't sure based on Matt's original
> suggestion/description of Ambari, whether that was something that Ambari
> had also designed for or not.
>
> On Fri, Jan 13, 2017 at 10:14 AM, Casey Stella <cestella@gmail.com> wrote:
>
> > Polling the Ambari server via REST (or their API if they have one) would
> > entail all workers hitting one server and create a single point of failure
> > (the ambari server is what serves up REST).  Zookeeper's intent is to not
> > have a single point of failure like this, and one of its main use-cases is
> > to serve up configs in a distributed environment.
> >
> > Casey
> >
> > On Fri, Jan 13, 2017 at 9:55 AM, Nick Allen <nick@nickallen.org> wrote:
> >
> > > Let me ask a stupid question.  What does Zookeeper do for us that Ambari
> > > cannot?  Why keep Zookeeper in the mix?
> > >
> > >
> > >
> > > On Fri, Jan 13, 2017 at 9:28 AM, David Lyle <dlyle65535@gmail.com> wrote:
> > >
> > > > In the main, yes. I've made some changes:
> > > >
> > > >  - Expand ambari to manage the remaining sensor-specific configs
> > > >  - Refactor the push calls to zookeeper (in ConfigurationUtils, I think)
> > > >    to push to ambari and take an Ambari user/pw and (optionally) reason
> > > >     - (Ambari can push to zookeeper, but it requires a service restart,
> > > >       so for "live changes" you may want to do both a rest call and a
> > > >       zookeeper update from ConfigurationUtils)
> > > >       WAS:
> > > >       Question remains about whether ambari can do the push to zookeeper
> > > >       or whether ConfigurationUtils has to push to zookeeper as well as
> > > >       update ambari.
> > > >  - Refactor the middleware that Ryan submitted to have the API calls take
> > > >    an Ambari user/pw and (optionally) reason
> > > >  - Refactor the management UI to pass in an Ambari user/pw and
> > > >    (optionally) reason
> > > >  - Refactor the Stellar Management functions CONFIG_PUT to accept an
> > > >    Ambari user/pw and (optionally) reason
> > > >
> > > > I think we'd need to do some detailed design around how to handle what we
> > > > expect to be dynamic configs, but the main principle should (imo) be to
> > > > always know who and why and make sure that Ambari is aware and is the
> > > > static backing store for Zookeeper.
> > > >
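The "always know who and why" principle can be illustrated with a minimal sketch, assuming a config push that records the user and an optional reason before it touches the live store. Every name here (`ConfigChange`, `put_config`, the dict and list standing in for ZooKeeper and for Ambari's history) is hypothetical; this is not Metron's ConfigurationUtils or Ambari's API, and authentication of the user/pw is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConfigChange:
    """One audited change: who changed which config, and why (hypothetical)."""
    user: str
    path: str
    old_value: Optional[str]
    new_value: str
    reason: Optional[str] = None
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def put_config(live_store, audit_log, path, value, user, reason=None):
    """Record the change first, then update the live store."""
    if not user:
        raise ValueError("a user is required for every config change")
    change = ConfigChange(user, path, live_store.get(path), value, reason)
    audit_log.append(change)
    live_store[path] = value
    return change


live_store = {}   # stand-in for the configs held in ZooKeeper
audit_log = []    # stand-in for the history the backing store would keep

put_config(live_store, audit_log, "global", '{"es.port": 9300}',
           user="admin", reason="initial load")
put_config(live_store, audit_log, "global", '{"es.port": 9301}', user="dlyle")

for change in audit_log:
    print(change.user, change.path, change.reason)
```

Making the reason optional but the user mandatory mirrors the "user/pw and (optionally) reason" wording in the list above.
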
> > > > -D...
> > > >
> > > >
> > > > On Fri, Jan 13, 2017 at 9:19 AM, Casey Stella <cestella@gmail.com> wrote:
> > > >
> > > > > So, basically, your proposed changes, broken into tangible gobbets of
> > > > > work:
> > > > >
> > > > >    - Expand ambari to manage the remaining sensor-specific configs
> > > > >    - Refactor the push calls to zookeeper (in ConfigurationUtils, I
> > > > >    think) to push to ambari and take a reason
> > > > >       - Question remains about whether ambari can do the push to
> > > > >       zookeeper or whether ConfigurationUtils has to push to zookeeper
> > > > >       as well as update ambari.
> > > > >    - Refactor the middleware that Ryan submitted to have the API calls
> > > > >    take a reason
> > > > >    - Refactor the management UI to pass in a reason
> > > > >    - Refactor the Stellar Management functions CONFIG_PUT to accept a
> > > > >    reason
> > > > >
> > > > > Just so we can evaluate it and I can ensure I haven't overlooked some
> > > > > important point.  Please tell me if Ambari cannot do the things we're
> > > > > suggesting it can do.
> > > > >
> > > > > Casey
> > > > >
> > > > > On Fri, Jan 13, 2017 at 9:15 AM, David Lyle <dlyle65535@gmail.com> wrote:
> > > > >
> > > > > > That's exactly correct, Casey. Basically, an expansion of what we're
> > > > > > currently doing with global.json, enrichment.properties and
> > > > > > elasticsearch.properties.
> > > > > >
> > > > > > -D...
> > > > > >
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 9:12 AM, Casey Stella <cestella@gmail.com> wrote:
> > > > > >
> > > > > > > I would suggest not having Ambari replace zookeeper.  I think the
> > > > > > > proposal is to have Ambari replace the editable store (like the JSON
> > > > > > > files on disk).  Zookeeper would be the source of truth for the
> > > > > > > running topologies and ambari would be sync'd to it.
> > > > > > >
> > > > > > > Correct me if I misspeak, Dave or Matt.
> > > > > > >
> > > > > > > Casey
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 9:09 AM, Nick Allen <nick@nickallen.org> wrote:
> > > > > > >
> > > > > > > > Ambari seems like a logical choice.
> > > > > > > >
> > > > > > > > *>> It doesn’t natively integrate Zookeeper storage of configs, but
> > > > > > > > there is a natural place to specify copy to/from Zookeeper for the
> > > > > > > > files desired.*
> > > > > > > >
> > > > > > > > How would Ambari interact with Zookeeper in this scenario?  Would
> > > > > > > > Ambari replace Zookeeper completely? Or would Zookeeper act as the
> > > > > > > > persistence tier under Ambari?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jan 12, 2017 at 9:24 PM, Matt Foley <mattf@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Mike, could you try again on the image, please, making sure it is a
> > > > > > > > > simple format (gif, png, or jpeg)?  It got munched, at least in my
> > > > > > > > > viewer.  Thanks.
> > > > > > > > >
> > > > > > > > > Casey, responding to some of the questions you raised:
> > > > > > > > >
> > > > > > > > > I’m going to make a rather strong statement:  We already have a
> > > > > > > > > service “to intermediate and handle config update/retrieval”.
> > > > > > > > > Furthermore, it:
> > > > > > > > > - Correctly handles the problems of distributed services running on
> > > > > > > > >   multi-node clusters.  (That’s a HARD problem, people, and we
> > > > > > > > >   shouldn’t try to reinvent the wheel.)
> > > > > > > > > - Correctly handles Kerberos security. (That’s kinda hard too, or at
> > > > > > > > >   least a lot of work.)
> > > > > > > > > - It does automatic versioning of configurations, and allows viewing,
> > > > > > > > >   comparing, and reverting historical configs
> > > > > > > > > - It has a capable REST API for all those things.
> > > > > > > > > It doesn’t natively integrate Zookeeper storage of configs, but there
> > > > > > > > > is a natural place to specify copy to/from Zookeeper for the files
> > > > > > > > > desired.
> > > > > > > > >
> > > > > > > > > It is Ambari.  And we should commit to it, rather than try to
> > > > > > > > > re-create such features.
> > > > > > > > > Because it has a good REST API, it is perfectly feasible to implement
> > > > > > > > > Stellar functions that call it.
> > > > > > > > > GUI configuration tools can also use the Ambari APIs, or better yet
> > > > > > > > > be integrated in an “Ambari View”. (Eg, see the “Yarn Capacity
> > > > > > > > > Scheduler Configuration Tool” example in the Ambari documentation,
> > > > > > > > > under “Using Ambari Views”.)
> > > > > > > > >
> > > > > > > > > Arguments are: Parsimony, Sufficiency, Not reinventing the wheel, and
> > > > > > > > > Not spending weeks and weeks of developer time over the next year
> > > > > > > > > reinventing the wheel while getting details wrong multiple times…
> > > > > > > > >
> > > > > > > > > Okay, off soapbox.
> > > > > > > > >
> > > > > > > > > Casey asked what the config update behavior of Ambari is, and how it
> > > > > > > > > will interact with changes made from outside Ambari.
> > > > > > > > > The following is from my experience working with the Ambari Mpack for
> > > > > > > > > Metron.  I am not otherwise an Ambari expert, so tomorrow I’ll get it
> > > > > > > > > reviewed by an Ambari development engineer.
> > > > > > > > >
> > > > > > > > > Ambari-server runs on one node, and Ambari-agent runs on each of the
> > > > > > > > > nodes.
> > > > > > > > > Ambari-server has a private set of py, xml, and template files, which
> > > > > > > > > together are used both to generate the Ambari configuration GUI, with
> > > > > > > > > defaults, and to generate configuration files (of any needed
> > > > > > > > > filetype) for the various Stack components.
> > > > > > > > > Ambari-server also has a database where it stores the schema related
> > > > > > > > > to these files, so even if you reach in and edit Ambari’s files, it
> > > > > > > > > will Error out if the set of parameters or parameter names changes.
> > > > > > > > > The historical information about configuration changes is also stored
> > > > > > > > > in the db.
> > > > > > > > > For each component (and in the case of Metron, for each topology),
> > > > > > > > > there is a python file which controls the logic for these actions,
> > > > > > > > > among others:
> > > > > > > > > - Install
> > > > > > > > > - Start / stop / restart / status
> > > > > > > > > - Configure
> > > > > > > > >
> > > > > > > > > It is actually up to this python code (which we wrote for the Metron
> > > > > > > > > Mpack) what happens in each of these API calls.  But the current
> > > > > > > > > code, and I believe this is typical of Ambari-managed components,
> > > > > > > > > performs a “Configure” action whenever you press the “Save” button
> > > > > > > > > after changing a component config in Ambari, and also on each Install
> > > > > > > > > and Start or Restart.
> > > > > > > > >
> > > > > > > > > The Configure action consists of approximately the following sequence
> > > > > > > > > (see disclaimer above :-)
> > > > > > > > > - Recreate the generated config files, using the template files and
> > > > > > > > >   the actual configuration most recently set in Ambari
> > > > > > > > >   o Note this is also under the control of python code that we wrote,
> > > > > > > > >     and this is the appropriate place to push to ZK if desired.
> > > > > > > > > - Propagate those config files to each Ambari-agent, with a command
> > > > > > > > >   to set them locally
> > > > > > > > > - The ambari-agents on each node receive the files and write them to
> > > > > > > > >   the specified locations on local storage
> > > > > > > > >
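The Configure sequence described above can be sketched in a few lines, assuming in-memory stand-ins for everything involved: plain dicts play the role of each agent's local storage and of ZooKeeper, and `string.Template` stands in for the Mpack's template files. The function names (`render_configs`, `configure`) and the sample settings are hypothetical, not Ambari's or the Metron Mpack's actual code.

```python
from string import Template


def render_configs(templates, settings):
    """Step 1: regenerate config files from templates and current settings."""
    return {name: Template(text).substitute(settings)
            for name, text in templates.items()}


def configure(templates, settings, agents, push_to_zk=None):
    """Run the sequence against in-memory stand-ins and return the files."""
    rendered = render_configs(templates, settings)
    if push_to_zk is not None:
        # The point where the service's own python code could push to ZK.
        push_to_zk(rendered)
    for agent_files in agents.values():
        # Steps 2 and 3: each agent receives the files and writes them locally.
        agent_files.update(rendered)
    return rendered


templates = {"global.json": '{"es.ip": "$es_ip", "es.port": $es_port}'}
settings = {"es_ip": "node1", "es_port": 9300}
agents = {"node1": {}, "node2": {}}   # per-node "local storage"
zookeeper = {}                        # stand-in for ZooKeeper

configure(templates, settings, agents, push_to_zk=zookeeper.update)
print(agents["node2"]["global.json"])
```

Because the files are always regenerated from the settings held by the server, anything edited directly on a node is overwritten the next time `configure` runs, which is the behavior the thread is worried about.
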
> > > > > > > > > Ambari-server then whines that the updated services should be
> > > > > > > > > restarted, but does not initiate that action itself (unless of course
> > > > > > > > > the initiating action was a Start command from the administrator).
> > > > > > > > >
> > > > > > > > > Make sense?  It’s all quite straightforward in concept, there’s just
> > > > > > > > > an awful lot of stuff wrapped around that to make it all go smoothly
> > > > > > > > > and handle the problems when it doesn’t.
> > > > > > > > >
> > > > > > > > > There’s additional complexity in that the Ambari-agent also caches
> > > > > > > > > (on each node) both the template files and COMPILED forms of the
> > > > > > > > > python files (.pyc) involved in transforming them.  The pyc files
> > > > > > > > > incorporate some amount of additional info regarding parameter
> > > > > > > > > values, but I’m not sure of the form.  I don’t think that changes the
> > > > > > > > > above in any practical way unless you’re trying to cheat Ambari by
> > > > > > > > > reaching in and editing its files directly.  In that case, you also
> > > > > > > > > need to whack the pyc files (on each node) to force the data to be
> > > > > > > > > reloaded from Ambari-server.  Best solution is don’t cheat.
> > > > > > > > >
> > > > > > > > > Also, there may be circumstances under which the Ambari-agent will
> > > > > > > > > detect changes and re-write the latest version it knows of the config
> > > > > > > > > files, even without a Save or Start action at the Ambari-server.  I’m
> > > > > > > > > not sure of this and need to check with Ambari developers.  It may no
> > > > > > > > > longer happen, although I’m pretty sure change detection/reversion
> > > > > > > > > was a feature of early versions of Ambari.
> > > > > > > > >
> > > > > > > > > Hope this helps,
> > > > > > > > > --Matt
> > > > > > > > >
> > > > > > > > > ================================================
> > > > > > > > > From: Michael Miklavcic <michael.miklavcic@gmail.com>
> > > > > > > > > Reply-To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
> > > > > > > > > Date: Thursday, January 12, 2017 at 3:59 PM
> > > > > > > > > To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
> > > > > > > > > Subject: Re: [DISCUSS] Ambari Metron Configuration Management
> > > > > > > > > consequences and call to action
> > > > > > > > >
> > > > > > > > > Hi Casey,
> > > > > > > > >
> > > > > > > > > Thanks for starting this thread. I believe you are correct in your
> > > > > > > > > assessment of the 4 options for updating configs in Metron.  When
> > > > > > > > > using more than one of these options we can get into a split-brain
> > > > > > > > > scenario. A basic example is updating the global config on disk and
> > > > > > > > > using the zk_load_configs.sh. Later, if a user decides to restart
> > > > > > > > > Ambari, the cached version stored by Ambari (it's in the MySQL or
> > > > > > > > > other database backing Ambari) will be written out to disk in the
> > > > > > > > > defined config directory, and subsequently loaded using the
> > > > > > > > > zk_load_configs.sh under the hood. Any global configuration modified
> > > > > > > > > outside of Ambari will be lost at this point. This is obviously
> > > > > > > > > undesirable, but I also like the purpose and utility exposed by the
> > > > > > > > > multiple config management interfaces we currently have available. I
> > > > > > > > > also agree that a service would be best.
> > > > > > > > >
> > > > > > > > > For reference, here's my understanding of the current configuration
> > > > > > > > > loading mechanisms and their deps.
> > > > > > > > >
> > > > > > > > > <image>
> > > > > > > > >
> > > > > > > > > Mike
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <cestella@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > In the course of discussion on the PR for METRON-652
> > > > > > > > > <https://github.com/apache/incubator-metron/pull/415> something that I
> > > > > > > > > should definitely have understood better came to light, and I thought
> > > > > > > > > it was worth bringing to the attention of the community to
> > > > > > > > > clarify/discuss: just how we manage configs.
> > > > > > > > >
> > > > > > > > > Currently (assuming the management UI that Ryan Merriman submitted)
> > > > > > > > > configs are managed/adjusted via a couple of different mechanisms.
> > > > > > > > >
> > > > > > > > >    - zk_load_utils.sh: pushed and pulled from disk to zookeeper
> > > > > > > > >    - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT
> > > > > > > > >    functions
> > > > > > > > >    - Ambari: initialized via the zk_load_utils script and then some
> > > > > > > > >    of them are managed directly (global config) and some indirectly
> > > > > > > > >    (sensor-specific configs).
> > > > > > > > >       - NOTE: Upon service restart, it may or may not overwrite
> > > > > > > > >       changes on disk or on zookeeper.  *Can someone more
> > > > > > > > >       knowledgeable than me about this describe precisely the
> > > > > > > > >       semantics that we can expect on service restart for Ambari?
> > > > > > > > >       What gets overwritten on disk and what gets updated in ambari?*
> > > > > > > > >    - The Management UI: manages some of the configs. *RYAN: Which
> > > > > > > > >    configs do we support here and which don't we support here?*
> > > > > > > > >
> > > > > > > > > As you can see, we have a mishmash of mechanisms to update and manage
> > > > > > > > > the configuration for Metron in zookeeper.  In the beginning the
> > > > > > > > > approach was just to edit configs on disk and push/pull them via
> > > > > > > > > zk_load_utils.  Configs could be historically managed using source
> > > > > > > > > control, etc.  As we got more and more components managing the
> > > > > > > > > configs, we haven't taken care that they all work with each other in
> > > > > > > > > an expected way (I believe these are true..correct me if I'm wrong):
> > > > > > > > >
> > > > > > > > >    - If configs are modified in the management UI or the Stellar REPL
> > > > > > > > >    and someone forgets to pull the configs from zookeeper to disk,
> > > > > > > > >    before they do a push via zk_load_utils, they will clobber the
> > > > > > > > >    configs in zookeeper with old configs.
> > > > > > > > >    - If the global config is changed on disk and the ambari service
> > > > > > > > >    restarts, it'll get reset with the original global config.
> > > > > > > > >    - *Ryan, in the management UI, if someone changes the zookeeper
> > > > > > > > >    configs from outside, are those configs reflected immediately in
> > > > > > > > >    the UI?*
> > > > > > > > >
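The first bullet above (a push from a stale on-disk copy clobbering newer configs) is the classic lost-update problem. One common guard is a version-checked write; ZooKeeper's own setData call accepts an expected version and rejects the write on a mismatch. The sketch below only mimics that idea in memory: `VersionedStore` and `StaleWriteError` are hypothetical names, and its version numbering (0 meaning "does not exist yet") is a simplification, not ZooKeeper's actual semantics.

```python
class StaleWriteError(Exception):
    """Raised when a write is based on an out-of-date read."""


class VersionedStore:
    """In-memory stand-in for a store with per-path version numbers."""

    def __init__(self):
        self._data = {}  # path -> (value, version)

    def get(self, path):
        # Version 0 means the path has never been written.
        return self._data.get(path, (None, 0))

    def set(self, path, value, expected_version):
        _, current = self.get(path)
        if expected_version != current:
            raise StaleWriteError(
                f"{path}: expected version {expected_version}, found {current}")
        self._data[path] = (value, current + 1)
        return current + 1


store = VersionedStore()
store.set("global", "original", expected_version=0)

# A script pulls the config to disk and remembers the version it saw.
on_disk, seen_version = store.get("global")

# Meanwhile the config is changed through another interface.
store.set("global", "changed in the REPL", expected_version=seen_version)

# Pushing the old copy from disk is now rejected instead of clobbering.
try:
    store.set("global", on_disk, expected_version=seen_version)
    clobbered = True
except StaleWriteError:
    clobbered = False
print("clobbered:", clobbered)
```

A check like this does not decide which change should win; it only turns a silent overwrite into an error that the person pushing has to resolve.
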
> > > > > > > > >
> > > > > > > > > It seems to me that we have a couple of options here:
> > > > > > > > >
> > > > > > > > >    - A service to intermediate and handle config update/retrieval and
> > > > > > > > >    tracking historical changes so these different mechanisms can use
> > > > > > > > >    a common component for config management/tracking and refactor the
> > > > > > > > >    existing mechanisms to use that service
> > > > > > > > >    - Standardize on exactly one component to manage the configs and
> > > > > > > > >    regress the others (that's a verb, right?   nicer than delete.)
> > > > > > > > >
> > > > > > > > > I happen to like the service approach, myself, but I wanted to put it
> > > > > > > > > up for discussion and hopefully someone will volunteer to design such
> > > > > > > > > a thing.
> > > > > > > > >
> > > > > > > > > To frame the debate, I want us to keep in mind a couple of things
> > > > > > > > > that may or may not be relevant to the discussion:
> > > > > > > > >
> > > > > > > > >    - We will eventually be moving to support kerberos so there should
> > > > > > > > >    at least be a path to use kerberos for any solution IMO
> > > > > > > > >    - There is value in each of the different mechanisms in place now.
> > > > > > > > >    If there weren't, then they wouldn't have been created.  Before we
> > > > > > > > >    try to make this a "there can be only one" argument, I'd like to
> > > > > > > > >    hear very good arguments.
> > > > > > > > >
> > > > > > > > > Finally, I'd appreciate if some people might answer the questions I
> > > > > > > > > have in bold there.  Hopefully this discussion, if nothing else
> > > > > > > > > happens, will result in fodder for proper documentation of the ins
> > > > > > > > > and outs of each of the components bulleted above.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > >
> > > > > > > > > Casey
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Nick Allen <nick@nickallen.org>
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Nick Allen <nick@nickallen.org>
> > >
> >
>
>
>
> --
> Nick Allen <nick@nickallen.org>
>
