metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: [DISCUSS] Persisting user data
Date Thu, 03 Aug 2017 09:24:20 GMT
I'd vote for a DB-based solution, but I'd argue that any solution shouldn't
be database specific (e.g. Postgres), but JDBC-generic.  People and
organizations have very strong views regarding databases and I'd prefer to
side-step those holy wars by being agnostic.
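
To illustrate what "JDBC-generic" could look like, here is a minimal sketch.
It assumes a hypothetical saved_search table and a hypothetical
SavedSearchStore class (neither comes from the Metron code base), and it
touches only the standard java.sql / javax.sql interfaces and plain SQL, so
the vendor choice is confined to whichever DataSource gets wired in.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// Hypothetical store for saved searches. The table and column names are
// illustrative assumptions, not an existing Metron schema.
final class SavedSearchStore {
    // Plain SQL with positional parameters only: no vendor-specific syntax.
    static final String INSERT_SQL =
        "INSERT INTO saved_search (username, search_name, query_text) VALUES (?, ?, ?)";
    static final String SELECT_SQL =
        "SELECT search_name FROM saved_search WHERE username = ? ORDER BY search_name";

    // The only database-specific piece is the DataSource passed in
    // (Postgres, MySQL, or anything else with a JDBC driver).
    private final DataSource dataSource;

    SavedSearchStore(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Persist one saved search for a user.
    void save(String username, String searchName, String queryText) throws SQLException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(INSERT_SQL)) {
            stmt.setString(1, username);
            stmt.setString(2, searchName);
            stmt.setString(3, queryText);
            stmt.executeUpdate();
        }
    }

    // Return the names of a user's saved searches, sorted by name.
    List<String> searchNames(String username) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(SELECT_SQL)) {
            stmt.setString(1, username);
            try (ResultSet rows = stmt.executeQuery()) {
                while (rows.next()) {
                    names.add(rows.getString(1));
                }
            }
        }
        return names;
    }
}
```

With something along these lines, switching databases would mean changing
the JDBC URL and driver in configuration rather than changing code, as long
as the SQL stays within what the candidate databases have in common.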

On Wed, Aug 2, 2017 at 9:36 PM, Ryan Merriman <merrimanr@gmail.com> wrote:

> Spring supports a variety of databases including Postgres.  I have no
> problem with using Postgres instead of MySQL.
>
> On Wed, Aug 2, 2017 at 3:32 PM, Simon Elliston Ball <
> simon@simonellistonball.com> wrote:
>
> > Agreed on Postgres. It's a lot easier to work with license-wise in apache
> > projects, and has a lot of the capability we need here, especially if we
> > can find a sensible ORM. Anyone got any thoughts on what would work
> there?
> >
> > Simon
> >
> > > On 2 Aug 2017, at 21:21, Matt Foley <mattf@apache.org> wrote:
> > >
> > > Hi Ryan,
> > > Zookeeper has a default (and seldom changed) max znode size of 1MB, but
> > it is “designed to store data on the order of kilobytes in size.”[1]  And
> > it’s not really intended for frequently-changing data, which is okay
> here.
> > But I just included it for completeness, I’m not advocating for its use
> > here.
> > >
> > > I agree with you that the problem, especially because it includes
> shared
> > config, would fit well in a db.  I’d suggest you consider PostgreSQL
> rather
> > than MySQL, as postgres is built into Redhat 6 and 7, and Ambari now uses
> > it by default, so an available server might be conveniently at hand in
> most
> > deployments.  Definitely assume the user will want to use an external db
> > instance, rather than one dedicated to this use.  Conveniently Postgres
> > also has a native REST interface, with the usual authorization options.
> > >
> > > Never mind about Ambari Views for now.  It’s just a way to get GUI
> > dashboards without writing all the infrastructure for it, which as you
> say
> > is somewhat water under the bridge.
> > > Cheers,
> > > --Matt
> > >
> > > [1] https://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html
> > >
> > >
> > >
> > > On 8/2/17, 12:34 PM, "Ryan Merriman" <merrimanr@gmail.com> wrote:
> > >
> > >    Matt,
> > >
> > >    Thank you for the suggestions.  I forgot to include Zookeeper.  Are
> > there
> > >    any tradeoffs we should be aware of if we decide to use Zookeeper?
> > Are
> > >    there guidelines for how much data can be stored in Zookeeper?
> > >
> > >    To answer your questions:
> > >
> > >    1.  I think both use cases make sense so a combination of shared and
> > >    personal.
> > >    2.  I was planning on managing authorization in the REST layer.  For
> > now
> > >    viewer login auth (which is really REST auth) will suffice but we
> > might
> > >    consider other methods since authentication is pluggable here.
> > >    3.  I had not considered Ambari Views since this will support an
> > existing
> > >    UI.  How would Ambari Views help us here?
> > >
> > >    I will proceed initially with a saved search POC using a relational
> > >    database unless you think that is a bad idea or there are other
> better
> > >    options.  Hopefully an example will further the discussion.
> > >
> > >    Ryan
> > >
> > >>    On Wed, Jul 26, 2017 at 6:31 PM, Matt Foley <mattf@apache.org>
> > wrote:
> > >>
> > >> There’s a couple other places you could put config info (but maybe not
> > >> saved searches):
> > >> -  Zookeeper
> > >> -  metron-alerts-ui/config.xml or config.json  file
> > >> -  the Ambari database, whichever it happens to be
> > >>
> > >> Questions that influence the decision include:
> > >> 1. Should there be one configuration shared among users, or strictly
> > >> per-user config?  Or a combination of shared and personal?
> > >> 2. What security do you wish to maintain on changing those settings,
> > both
> > >> shared and personal?  What authentication/authorization scheme will
> you
> > >> use?  Is viewer login auth sufficient for this?
> > >> 3. Will you assume Ambari exists?  Did you consider using Ambari Views
> > as
> > >> the basis? (https://cwiki.apache.org/confluence/display/AMBARI/Views
> )
> > >>
> > >> On 7/26/17, 2:54 PM, "Ryan Merriman" <merrimanr@gmail.com> wrote:
> > >>
> > >>    In anticipation of METRON-988 being merged into master, there will
> > be a
> > >>    need to persist user preferences such as UI layout, saved searches,
> > >> search
> > >>    history, etc.  I think where and how we persist this data should be
> > >>    discussed in order to facilitate a design.  This data won't be
> large
> > in
> > >>    scale and may or may not be relational.  The initial features I am
> > >> aware of
> > >>    don't require a relational model but I'm sure there will be some
> that
> > >> do in
> > >>    the future.  I'm also assuming this code will live in the REST
> > >> application
> > >>    but someone correct me if there is a reason to keep it somewhere
> > else.
> > >>
> > >>    I think it would be preferable to leverage something that is
> already
> > >> in our
> > >>    stack and available as a dependency.  However I would not be
> against
> > >> adding
> > >>    something if it really were the right tool for the job.  Assuming
> > >> others
> > >>    agree we should stick with our current stack, I see these options:
> > >>
> > >>       - MySQL (or other relational database)
> > >>          - good fit for the size of data
> > >>          - relational capabilities
> > >>          - an ORM framework will be necessary which will increase our
> > >>          dependencies and complexity
> > >>       - HBase
> > >>          - client setup and code will likely be simpler and less
> complex
> > >>          - limited data model
> > >>       - Elasticsearch
> > >>          - json is a convenient data model
> > >>          - we already store user preferences here (Kibana dashboards)
> > >>          - we have abstracted our search engine interactions in
> several
> > >> places
> > >>          and would have to here too
> > >>
> > >>    Elasticsearch is out for me because we view search engines as
> > >> pluggable.  I
> > >>    think HBase would be the easiest to implement and get working but
> I'm
> > >>    worried we'll have similar use cases that won't be a good fit for
> > >> HBase.
> > >>    In that case we would need to come up with an alternative
> persistence
> > >>    solution anyways.  I think MySQL is a good fit long term but I'm
> > >> concerned
> > >>    about adding a heavy ORM framework.  Also, we can't use Hibernate
> > >> because
> > >>    it is not license friendly.
> > >>
> > >>    Does anyone have any thoughts on these options or other ideas?
> > >>
> > >>    This requirement also brings up another topic that is outside of
> this
> > >>    discussion.  Should we reevaluate our authentication strategy?
> > >> Currently
> > >>    the REST application uses JDBC for this but if we decide a
> different
> > >>    mechanism is better then we no longer need a relational database.
> > This
> > >>    might affect our decision to use MySQL for this kind of data
> > >> persistence.
> > >>
> > >>    Ryan
> > >>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> >
>
