metron-dev mailing list archives

From Carolyn Duby <cd...@hortonworks.com>
Subject Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action
Date Fri, 13 Jan 2017 15:19:18 GMT
ZooKeeper is more efficient if you want to push updates to the running topologies without
requiring a restart.  Is this useful going forward?  I think it is for development, but in
production environments you would generally only be updating during a maintenance window,
so requiring a restart is not horrible.
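
As a rough illustration of the restart-free update idea (a minimal sketch, not Metron
code): the Python below uses a hypothetical in-memory stand-in for ZooKeeper, so the
class names, the path and the "batch.size" key are all assumptions made for the example.
It only shows the shape of the pattern: workers register a callback on a config path and
swap in the new values when the path is written.

```python
import json
from typing import Callable, Dict, List


class InMemoryConfigStore:
    """Hypothetical stand-in for a ZooKeeper-like store; not a real client.

    Real ZooKeeper watches are one-shot and must be re-registered after they
    fire; this sketch keeps callbacks registered to stay short.
    """

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._watchers: Dict[str, List[Callable[[bytes], None]]] = {}

    def watch(self, path: str, callback: Callable[[bytes], None]) -> None:
        self._watchers.setdefault(path, []).append(callback)
        if path in self._data:
            callback(self._data[path])  # deliver the current value right away

    def set(self, path: str, value: bytes) -> None:
        self._data[path] = value
        for callback in self._watchers.get(path, []):
            callback(value)  # push the change to every registered worker


class Worker:
    """A long-running worker that picks up new config without a restart."""

    def __init__(self, store: InMemoryConfigStore, path: str) -> None:
        self.config: dict = {}
        self.reloads = 0
        store.watch(path, self._on_change)

    def _on_change(self, raw: bytes) -> None:
        self.config = json.loads(raw.decode("utf-8"))
        self.reloads += 1


store = InMemoryConfigStore()
path = "/example/configs/global"  # hypothetical path, chosen for illustration
store.set(path, json.dumps({"batch.size": 10}).encode("utf-8"))
workers = [Worker(store, path) for _ in range(3)]

# A single write reaches all three workers; none of them is restarted.
store.set(path, json.dumps({"batch.size": 50}).encode("utf-8"))
print([w.config["batch.size"] for w in workers])  # prints [50, 50, 50]
```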

Outside of configuration sharing, ZooKeeper is essential for coordinating clustered solutions,
for example leader election in an HA cluster or distributing worker assignments.
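
To make the leader election case concrete, here is a minimal sketch of the usual
lowest-sequence-number approach, again simulated in memory rather than against a real
ZooKeeper ensemble. The ElectionSim class and the node names are hypothetical; in a real
deployment the sequence numbers would come from ephemeral sequential znodes, and a
member's entry would disappear when its session ends.

```python
import itertools
from typing import Dict, Optional


class ElectionSim:
    """In-memory imitation of a lowest-sequence-number election."""

    def __init__(self) -> None:
        self._next_seq = itertools.count()
        self._members: Dict[int, str] = {}  # sequence number -> candidate name

    def join(self, name: str) -> int:
        # Each candidate gets the next sequence number when it joins.
        seq = next(self._next_seq)
        self._members[seq] = name
        return seq

    def leave(self, seq: int) -> None:
        # Models a candidate's entry vanishing when it fails or disconnects.
        self._members.pop(seq, None)

    def leader(self) -> Optional[str]:
        # The live candidate with the smallest sequence number leads.
        if not self._members:
            return None
        return self._members[min(self._members)]


election = ElectionSim()
seq_a = election.join("node-a")
seq_b = election.join("node-b")
seq_c = election.join("node-c")

first = election.leader()
election.leave(seq_a)        # the current leader goes away
second = election.leader()   # the next-lowest candidate takes over
print(first, second)         # prints node-a node-b
```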

Thanks
Carolyn



On 1/13/17, 10:14 AM, "Casey Stella" <cestella@gmail.com> wrote:

>Polling the Ambari server via REST (or their API if they have one) would
>entail all workers hitting one server and would create a single point of
>failure (the Ambari server is what serves up REST).  Zookeeper's intent is
>to not have a single point of failure like this, and one of its main
>use-cases is to serve up configs in a distributed environment.
>
>Casey
>
>On Fri, Jan 13, 2017 at 9:55 AM, Nick Allen <nick@nickallen.org> wrote:
>
>> Let me ask a stupid question.  What does Zookeeper do for us that Ambari
>> cannot?  Why keep Zookeeper in the mix?
>>
>>
>>
>> On Fri, Jan 13, 2017 at 9:28 AM, David Lyle <dlyle65535@gmail.com> wrote:
>>
>> > In the main, yes. I've made some changes:
>> >
>> >  - Expand ambari to manage the remaining sensor-specific configs
>> >  - Refactor the push calls to zookeeper (in ConfigurationUtils, I think)
>> >    to push to ambari and take an Ambari user/pw and (optionally) reason
>> >  - (Ambari can push to zookeeper, but it requires a service restart, so
>> >    for "live changes" you may want to do both a REST call and a zookeeper
>> >    update from ConfigurationUtils)
>> >    WAS
>> >    Question remains about whether ambari can do the push to zookeeper
>> >    or whether ConfigurationUtils has to push to zookeeper as well as
>> >    update ambari.
>> >  - Refactor the middleware that Ryan submitted to have the API calls take
>> >    an Ambari user/pw and (optionally) reason
>> >  - Refactor the management UI to pass in an Ambari user/pw and
>> >    (optionally) reason
>> >  - Refactor the Stellar Management functions CONFIG_PUT to accept an
>> >    Ambari user/pw and (optionally) reason
>> >
>> > I think we'd need to do some detailed design around how to handle what we
>> > expect to be dynamic configs, but the main principle should (imo) be to
>> > always know who and why and make sure that Ambari is aware and is the
>> > static backing store for Zookeeper.
>> >
>> > -D...
>> >
>> >
>> > On Fri, Jan 13, 2017 at 9:19 AM, Casey Stella <cestella@gmail.com> wrote:
>> >
>> > > So, basically, your proposed changes, broken into tangible gobbets of work:
>> > >
>> > >    - Expand ambari to manage the remaining sensor-specific configs
>> > >    - Refactor the push calls to zookeeper (in ConfigurationUtils, I think)
>> > >    to push to ambari and take a reason
>> > >       - Question remains about whether ambari can do the push to zookeeper
>> > >       or whether ConfigurationUtils has to push to zookeeper as well as
>> > >       update ambari.
>> > >    - Refactor the middleware that Ryan submitted to have the API calls take
>> > >    a reason
>> > >    - Refactor the management UI to pass in a reason
>> > >    - Refactor the Stellar Management functions CONFIG_PUT to accept a reason
>> > >
>> > > Just so we can evaluate it and I can ensure I haven't overlooked some
>> > > important point.  Please tell me if Ambari cannot do the things we're
>> > > suggesting it can do.
>> > >
>> > > Casey
>> > >
>> > > On Fri, Jan 13, 2017 at 9:15 AM, David Lyle <dlyle65535@gmail.com> wrote:
>> > >
>> > > > That's exactly correct, Casey. Basically, an expansion of what we're
>> > > > currently doing with global.json, enrichment.properties and
>> > > > elasticsearch.properties.
>> > > >
>> > > > -D...
>> > > >
>> > > >
>> > > > On Fri, Jan 13, 2017 at 9:12 AM, Casey Stella <cestella@gmail.com> wrote:
>> > > >
>> > > > > I would suggest not having Ambari replace zookeeper.  I think the proposal
>> > > > > is to have Ambari replace the editable store (like the JSON files on
>> > > > > disk).  Zookeeper would be the source of truth for the running topologies
>> > > > > and ambari would be sync'd to it.
>> > > > >
>> > > > > Correct me if I misspeak, Dave or Matt.
>> > > > >
>> > > > > Casey
>> > > > >
>> > > > > On Fri, Jan 13, 2017 at 9:09 AM, Nick Allen <nick@nickallen.org> wrote:
>> > > > >
>> > > > > > Ambari seems like a logical choice.
>> > > > > >
>> > > > > > *>> It doesn’t natively integrate Zookeeper storage of configs, but there
>> > > > > > is a natural place to specify copy to/from Zookeeper for the files
>> > > > > > desired.*
>> > > > > >
>> > > > > > How would Ambari interact with Zookeeper in this scenario?  Would Ambari
>> > > > > > replace Zookeeper completely? Or would Zookeeper act as the persistence
>> > > > > > tier under Ambari?
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > On Thu, Jan 12, 2017 at 9:24 PM, Matt Foley <mattf@apache.org> wrote:
>> > > > > >
>> > > > > > > Mike, could you try again on the image, please, making sure it is a simple
>> > > > > > > format (gif, png, or jpeg)?  It got munched, at least in my viewer.  Thanks.
>> > > > > > >
>> > > > > > > Casey, responding to some of the questions you raised:
>> > > > > > >
>> > > > > > > I’m going to make a rather strong statement:  We already have a service
>> > > > > > > “to intermediate and handle config update/retrieval”.
>> > > > > > > Furthermore, it:
>> > > > > > > - Correctly handles the problems of distributed services running on
>> > > > > > >   multi-node clusters.  (That’s a HARD problem, people, and we shouldn’t try
>> > > > > > >   to reinvent the wheel.)
>> > > > > > > - Correctly handles Kerberos security.  (That’s kinda hard too, or at least
>> > > > > > >   a lot of work.)
>> > > > > > > - Does automatic versioning of configurations, and allows viewing,
>> > > > > > >   comparing, and reverting historical configs.
>> > > > > > > - Has a capable REST API for all those things.
>> > > > > > > It doesn’t natively integrate Zookeeper storage of configs, but there is a
>> > > > > > > natural place to specify copy to/from Zookeeper for the files desired.
>> > > > > > >
>> > > > > > > It is Ambari.  And we should commit to it, rather than try to re-create
>> > > > > > > such features.
>> > > > > > > Because it has a good REST API, it is perfectly feasible to implement
>> > > > > > > Stellar functions that call it.
>> > > > > > > GUI configuration tools can also use the Ambari APIs, or better yet be
>> > > > > > > integrated in an “Ambari View”.  (E.g., see the “Yarn Capacity Scheduler
>> > > > > > > Configuration Tool” example in the Ambari documentation, under “Using
>> > > > > > > Ambari Views”.)
>> > > > > > >
>> > > > > > > Arguments are: Parsimony, Sufficiency, Not reinventing the wheel, and Not
>> > > > > > > spending weeks and weeks of developer time over the next year reinventing
>> > > > > > > the wheel while getting details wrong multiple times…
>> > > > > > >
>> > > > > > > Okay, off soapbox.
>> > > > > > >
>> > > > > > > Casey asked what the config update behavior of Ambari is, and how it will
>> > > > > > > interact with changes made from outside Ambari.
>> > > > > > > The following is from my experience working with the Ambari Mpack for
>> > > > > > > Metron.  I am not otherwise an Ambari expert, so tomorrow I’ll get it
>> > > > > > > reviewed by an Ambari development engineer.
>> > > > > > >
>> > > > > > > Ambari-server runs on one node, and Ambari-agent runs on every node.
>> > > > > > > Ambari-server has a private set of py, xml, and template files, which
>> > > > > > > together are used both to generate the Ambari configuration GUI, with
>> > > > > > > defaults, and to generate configuration files (of any needed filetype) for
>> > > > > > > the various Stack components.
>> > > > > > > Ambari-server also has a database where it stores the schema related to
>> > > > > > > these files, so even if you reach in and edit Ambari’s files, it will
>> > > > > > > error out if the set of parameters or parameter names changes.  The
>> > > > > > > historical information about configuration changes is also stored in the db.
>> > > > > > > For each component (and in the case of Metron, for each topology), there
>> > > > > > > is a python file which controls the logic for these actions, among others:
>> > > > > > > - Install
>> > > > > > > - Start / stop / restart / status
>> > > > > > > - Configure
>> > > > > > >
>> > > > > > > It is actually up to this python code (which we wrote for the Metron
>> > > > > > > Mpack) what happens in each of these API calls.  But the current code, and
>> > > > > > > I believe this is typical of Ambari-managed components, performs a
>> > > > > > > “Configure” action whenever you press the “Save” button after changing a
>> > > > > > > component config in Ambari, and also on each Install and Start or Restart.
>> > > > > > >
>> > > > > > > The Configure action consists of approximately the following sequence (see
>> > > > > > > disclaimer above :-)
>> > > > > > > - Recreate the generated config files, using the template files and the
>> > > > > > >   actual configuration most recently set in Ambari
>> > > > > > >   o Note this is also under the control of python code that we wrote, and
>> > > > > > >     this is the appropriate place to push to ZK if desired.
>> > > > > > > - Propagate those config files to each Ambari-agent, with a command to set
>> > > > > > >   them locally
>> > > > > > > - The ambari-agents on each node receive the files and write them to the
>> > > > > > >   specified locations on local storage
>> > > > > > >
>> > > > > > > Ambari-server then whines that the updated services should be restarted,
>> > > > > > > but does not initiate that action itself (unless of course the initiating
>> > > > > > > action was a Start command from the administrator).
>> > > > > > >
>> > > > > > > Make sense?  It’s all quite straightforward in concept; there’s just an
>> > > > > > > awful lot of stuff wrapped around that to make it all go smoothly and
>> > > > > > > handle the problems when it doesn’t.
>> > > > > > >
>> > > > > > > There’s additional complexity in that the Ambari-agent also caches (on
>> > > > > > > each node) both the template files and COMPILED forms of the python files
>> > > > > > > (.pyc) involved in transforming them.  The pyc files incorporate some
>> > > > > > > amount of additional info regarding parameter values, but I’m not sure of
>> > > > > > > the form.  I don’t think that changes the above in any practical way unless
>> > > > > > > you’re trying to cheat Ambari by reaching in and editing its files
>> > > > > > > directly.  In that case, you also need to whack the pyc files (on each
>> > > > > > > node) to force the data to be reloaded from Ambari-server.  The best
>> > > > > > > solution is: don’t cheat.
>> > > > > > >
>> > > > > > > Also, there may be circumstances under which the Ambari-agent will detect
>> > > > > > > changes and re-write the latest version it knows of the config files, even
>> > > > > > > without a Save or Start action at the Ambari-server.  I’m not sure of this
>> > > > > > > and need to check with Ambari developers.  It may no longer happen,
>> > > > > > > although I’m pretty sure change detection/reversion was a feature of early
>> > > > > > > versions of Ambari.
>> > > > > > >
>> > > > > > > Hope this helps,
>> > > > > > > --Matt
>> > > > > > >
>> > > > > > > ================================================
>> > > > > > > From: Michael Miklavcic <michael.miklavcic@gmail.com>
>> > > > > > > Reply-To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>> > > > > > > Date: Thursday, January 12, 2017 at 3:59 PM
>> > > > > > > To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>> > > > > > > Subject: Re: [DISCUSS] Ambari Metron Configuration Management consequences
>> > > > > > > and call to action
>> > > > > > > Hi Casey,
>> > > > > > >
>> > > > > > > Thanks for starting this thread. I believe you are correct in your
>> > > > > > > assessment of the 4 options for updating configs in Metron. When using more
>> > > > > > > than one of these options we can get into a split-brain scenario. A basic
>> > > > > > > example is updating the global config on disk and loading it with
>> > > > > > > zk_load_configs.sh. Later, if a user decides to restart Ambari, the cached
>> > > > > > > version stored by Ambari (it's in the MySQL or other database backing
>> > > > > > > Ambari) will be written out to disk in the defined config directory, and
>> > > > > > > subsequently loaded using zk_load_configs.sh under the hood. Any global
>> > > > > > > configuration modified outside of Ambari will be lost at this point. This
>> > > > > > > is obviously undesirable, but I also like the purpose and utility exposed
>> > > > > > > by the multiple config management interfaces we currently have available. I
>> > > > > > > also agree that a service would be best.
>> > > > > > >
>> > > > > > > For reference, here's my understanding of the current configuration
>> > > > > > > loading mechanisms and their deps.
>> > > > > > >
>> > > > > > > <image>
>> > > > > > >
>> > > > > > > Mike
>> > > > > > >
>> > > > > > >
>> > > > > > > On Thu, Jan 12, 2017 at 3:08 PM, Casey Stella <cestella@gmail.com> wrote:
>> > > > > > >
>> > > > > > > In the course of discussion on the PR for METRON-652
>> > > > > > > <https://github.com/apache/incubator-metron/pull/415>, something that I
>> > > > > > > should definitely have understood better came to light, and I thought it
>> > > > > > > was worth bringing to the attention of the community for clarification and
>> > > > > > > discussion: just how we manage configs.
>> > > > > > >
>> > > > > > > Currently (assuming the management UI that Ryan Merriman submitted) configs
>> > > > > > > are managed/adjusted via a couple of different mechanisms:
>> > > > > > >
>> > > > > > >    - zk_load_utils.sh: pushed and pulled from disk to zookeeper
>> > > > > > >    - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT functions
>> > > > > > >    - Ambari: initialized via the zk_load_utils script and then some of them
>> > > > > > >    are managed directly (global config) and some indirectly (sensor-specific
>> > > > > > >    configs).
>> > > > > > >       - NOTE: Upon service restart, it may or may not overwrite changes on
>> > > > > > >       disk or on zookeeper.  *Can someone more knowledgeable than me about
>> > > > > > >       this describe precisely the semantics that we can expect on service
>> > > > > > >       restart for Ambari? What gets overwritten on disk and what gets
>> > > > > > >       updated in ambari?*
>> > > > > > >    - The Management UI: manages some of the configs. *RYAN: Which configs
>> > > > > > >    do we support here and which don't we support here?*
>> > > > > > >
>> > > > > > > As you can see, we have a mishmash of mechanisms to update and manage the
>> > > > > > > configuration for Metron in zookeeper.  In the beginning the approach was
>> > > > > > > just to edit configs on disk and push/pull them via zk_load_utils.  Configs
>> > > > > > > could be historically managed using source control, etc.  As we got more
>> > > > > > > and more components managing the configs, we haven't taken care that they
>> > > > > > > all work with each other in an expected way (I believe these are true;
>> > > > > > > correct me if I'm wrong):
>> > > > > > >
>> > > > > > >    - If configs are modified in the management UI or the Stellar REPL and
>> > > > > > >    someone forgets to pull the configs from zookeeper to disk, before they
>> > > > > > >    do a push via zk_load_utils, they will clobber the configs in zookeeper
>> > > > > > >    with old configs.
>> > > > > > >    - If the global config is changed on disk and the ambari service
>> > > > > > >    restarts, it'll get reset with the original global config.
>> > > > > > >    - *Ryan, in the management UI, if someone changes the zookeeper configs
>> > > > > > >    from outside, are those configs reflected immediately in the UI?*
>> > > > > > >
>> > > > > > >
>> > > > > > > It seems to me that we have a couple of options here:
>> > > > > > >
>> > > > > > >    - A service to intermediate and handle config update/retrieval and
>> > > > > > >    tracking historical changes so these different mechanisms can use a
>> > > > > > >    common component for config management/tracking and refactor the
>> > > > > > >    existing mechanisms to use that service
>> > > > > > >    - Standardize on exactly one component to manage the configs and regress
>> > > > > > >    the others (that's a verb, right?   nicer than delete.)
>> > > > > > >
>> > > > > > > I happen to like the service approach, myself, but I wanted to put it up
>> > > > > > > for discussion and hopefully someone will volunteer to design such a thing.
>> > > > > > >
>> > > > > > > To frame the debate, I want us to keep in mind a couple of things that may
>> > > > > > > or may not be relevant to the discussion:
>> > > > > > >
>> > > > > > >    - We will eventually be moving to support kerberos, so there should at
>> > > > > > >    least be a path to use kerberos for any solution, IMO
>> > > > > > >    - There is value in each of the different mechanisms in place now.  If
>> > > > > > >    there weren't, then they wouldn't have been created.  Before we try to
>> > > > > > >    make this a "there can be only one" argument, I'd like to hear very good
>> > > > > > >    arguments.
>> > > > > > >
>> > > > > > > Finally, I'd appreciate if some people might answer the questions I have in
>> > > > > > > bold there.  Hopefully this discussion, if nothing else happens, will
>> > > > > > > result in fodder for proper documentation of the ins and outs of each of
>> > > > > > > the components bulleted above.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > >
>> > > > > > > Casey
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Nick Allen <nick@nickallen.org>
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Nick Allen <nick@nickallen.org>
>>