metron-dev mailing list archives

From "Zeolla@GMail.com" <zeo...@gmail.com>
Subject Re: [DISCUSS] Opinionated Data Flows
Date Fri, 07 Oct 2016 02:20:35 GMT
In this case my initial inclination is implicit, to simplify the configs.
It doesn't seem overly complicated to implement, but I may be missing
something about the current state or future roadmap.

Jon
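
To make the "implicit" option concrete, here is a minimal Python sketch of
how a planner might infer splits and joins from a linear flow. Everything in
it is hypothetical and not part of Metron: the Step class, the reads/writes
field lists, the barrier flag, and plan_stages are assumptions made for
illustration. The assumption is that sources, parsers, filters and sinks act
as join points, while enrichments that do not depend on each other can share
a stage and run in parallel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One step of a flow (hypothetical model, not a Metron class)."""
    name: str
    reads: tuple = ()       # fields the step needs
    writes: tuple = ()      # fields the step produces
    barrier: bool = False   # assumption: source/parser/filter/sink are join points


def plan_stages(steps):
    """Group steps into stages.

    Steps that share a stage do not depend on each other, so an engine could
    fan them out (implicit split). A barrier step always gets a stage of its
    own after everything before it (implicit join).
    """
    stages = []     # list of lists of step names
    floor = 0       # first stage index a non-barrier step may use
    written = {}    # field name -> index of the stage that writes it
    for step in steps:
        if step.barrier:
            stages.append([step.name])
            index = len(stages) - 1
            floor = len(stages)
        else:
            index = floor
            for field_name in step.reads:
                if field_name in written:
                    index = max(index, written[field_name] + 1)
            if index == len(stages):
                stages.append([])
            stages[index].append(step.name)
        for field_name in step.writes:
            written[field_name] = index
    return stages


# The example flow from this thread, expressed in the hypothetical model.
FLOW = [
    Step("source", barrier=True),
    Step("parser", writes=("ip_src_addr", "ip_dst_addr"), barrier=True),
    Step("exists_src", reads=("ip_src_addr",), barrier=True),
    Step("exists_dst", reads=("ip_dst_addr",), barrier=True),
    Step("geo_ip_src", reads=("ip_src_addr",), writes=("geo_ip_src",)),
    Step("geo_ip_dst", reads=("ip_dst_addr",), writes=("geo_ip_dst",)),
    Step("application", reads=("ip_src_addr",), writes=("application",)),
    Step("owner", reads=("ip_src_addr",), writes=("owner",)),
    Step("elasticsearch", barrier=True),
]

if __name__ == "__main__":
    for number, stage in enumerate(plan_stages(FLOW)):
        print(number, stage)
```

With this model the four enrichments end up in a single stage between the
last filter and the Elasticsearch sink, so the user never has to write a
split or a join. Whether filters should really be treated as join points is
a design choice this sketch simply assumes.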

On Thu, Oct 6, 2016, 18:25 Matt Foley <mfoley@hortonworks.com> wrote:

> Would splitting and joining be implicit or explicit, for multi-path
> topologies?
> ________________________________________
> From: Zeolla@GMail.com <zeolla@gmail.com>
> Sent: Thursday, October 06, 2016 11:03 AM
> To: dev@metron.incubator.apache.org
> Subject: Re: [DISCUSS] Opinionated Data Flows
>
> It should also be smart enough to handle an order like:
>
> source("bro")
>   -> parser("BasicBroParser")
>   -> exists("ip_src_addr")
>   -> geo_ip_src = geo["ip_src_addr"]
>   -> application = assets["ip_src_addr"].application
>   -> owner = assets["ip_src_addr"].owner
>   -> exists("ip_dst_addr")
>   -> geo_ip_dst = geo["ip_dst_addr"]
>   -> elasticsearch("bro-index")
>
> Without hitting the same topology more than once.
>
> Jon
>
> On Thu, Oct 6, 2016 at 1:55 PM Nick Allen <nick@nickallen.org> wrote:
>
> > Here is a quick example with some hypothetical syntax.  Whatever that syntax
> > might be, it would be very simple, easy to understand, and leverage
> > high-level concepts specific to Metron.
> >
> > This flow consumes Bro data, ensures there are valid source/destination
> > IPs, performs geo-enrichment, asset enrichment and finally persists the
> > data in Elasticsearch.
> >
> >
> > source("bro")
> >   -> parser("BasicBroParser")
> >   -> exists("ip_src_addr")
> >   -> exists("ip_dst_addr")
> >   -> geo_ip_src = geo["ip_src_addr"]
> >   -> geo_ip_dst = geo["ip_dst_addr"]
> >   -> application = assets["ip_src_addr"].application
> >   -> owner = assets["ip_src_addr"].owner
> >   -> elasticsearch("bro-index")
> >
> >
> >
> >
> > On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen <nick@nickallen.org> wrote:
> >
> > > Chasing this bad idea down even further leads me to something even
> > > crazier.
> > >
> > > Stellar 1.0 can only operate within a single topology and in most cases
> > > only on a single message.  Stellar 2.0 could be the mechanism that allows
> > > users to define their own data flows and what "useful bits of Metron
> > > functionality" get plugged in.
> > >
> > > Once you have a DSL that allows users to define what they want Metron to
> > > do, then the underlying implementation mechanism (which is currently Storm)
> > > can also be swapped out.  If we have an even faster Storm implementation,
> > > then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> > > Flink, then we just swap in a Flink engine.
> > >
> > >
> > >
> > >
> > > On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen <nick@nickallen.org> wrote:
> > >
> > >> I totally "bird dogged the previous thread" as Casey likes to call it. :)
> > >> I am extracting this thought into a separate thread before I start
> > >> throwing out even more, crazier ideas.
> > >>
> > >>> In general, Metron is very opinionated about data flows right now.  We
> > >>> have Parser topologies that feed an Enrichment topology, which then feeds
> > >>> an Indexing topology.  We have useful bits of functionality (think Stellar
> > >>> transforms, Geo enrichment, etc.) that are closely coupled with these
> > >>> topologies (aka data flows).
> > >>>
> > >>
> > >>
> > >>> When a user wants to parse heterogeneous data from a single topic, that's
> > >>> not easy.  When a user wants enriched output to land in unique topics by
> > >>> sensor type, well, that's also not easy.  When a user wanted to skip
> > >>> enrichment of data sources, we actually re-architected the data flow to add
> > >>> the Indexing topology.
> > >>>
> > >>
> > >>
> > >>> In an ideal world, a user should be responsible for defining the data
> > >>> flow, not Metron.  Metron should provide the "useful bits of functionality"
> > >>> that a user can "plug in" wherever they like.  Metron itself should not care
> > >>> how the data is moving or what step in the process it is at.
> > >>
> > >>
> > >>
> > >>
> > >> --
> > >> Nick Allen <nick@nickallen.org>
> > >>
> > >
> > >
> > >
> > > --
> > > Nick Allen <nick@nickallen.org>
> > >
> >
> >
> >
> > --
> > Nick Allen <nick@nickallen.org>
> >
> --
>
> Jon
>
-- 

Jon
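
As a rough illustration of handling the interleaved ordering quoted above
without visiting a topology twice, the following Python sketch parses the
hypothetical flow syntax from this thread and coalesces the steps into one
pass per phase. The phase names, the regular expressions, and the
parse_flow and coalesce functions are all assumptions made for this example;
none of them are Metron or Stellar APIs. The sketch also assumes that moving
a filter ahead of an enrichment is safe, which only holds when the filter
does not read a field that an enrichment produces; it does not check this.

```python
import re
from itertools import groupby

# The interleaved flow from this thread, in the hypothetical syntax.
FLOW_TEXT = '''
source("bro")
  -> parser("BasicBroParser")
  -> exists("ip_src_addr")
  -> geo_ip_src = geo["ip_src_addr"]
  -> application = assets["ip_src_addr"].application
  -> owner = assets["ip_src_addr"].owner
  -> exists("ip_dst_addr")
  -> geo_ip_dst = geo["ip_dst_addr"]
  -> elasticsearch("bro-index")
'''

# Assumed phase order: each phase would map to a single topology visit.
PHASES = ["source", "parse", "filter", "enrich", "index"]

# name("argument")
CALL = re.compile(r'^(\w+)\("([^"]*)"\)$')
# target = table["key"] or target = table["key"].attribute
ASSIGN = re.compile(r'^(\w+)\s*=\s*(\w+)\["([^"]*)"\](?:\.(\w+))?$')

# Assumed mapping from call names in the example to phases.
CALL_PHASE = {
    "source": "source",
    "parser": "parse",
    "exists": "filter",
    "elasticsearch": "index",
}


def parse_flow(text):
    """Turn flow text into a list of (phase, step_text) pairs, in order."""
    steps = []
    for raw in text.strip().splitlines():
        line = re.sub(r"^->\s*", "", raw.strip())
        call = CALL.match(line)
        if call and call.group(1) in CALL_PHASE:
            steps.append((CALL_PHASE[call.group(1)], line))
        elif ASSIGN.match(line):
            steps.append(("enrich", line))
        else:
            raise ValueError(f"unrecognised step: {line!r}")
    return steps


def coalesce(steps):
    """Group steps so that each phase appears exactly once.

    sorted() is stable, so steps keep their relative order within a phase.
    """
    ordered = sorted(steps, key=lambda step: PHASES.index(step[0]))
    return [
        (phase, [text for _, text in group])
        for phase, group in groupby(ordered, key=lambda step: step[0])
    ]


if __name__ == "__main__":
    for phase, lines in coalesce(parse_flow(FLOW_TEXT)):
        print(phase, lines)
```

Under these assumptions both orderings posted in this thread reduce to the
same five phases, with the two exists checks grouped ahead of the four
enrichments.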
