metron-dev mailing list archives

From "Zeolla@GMail.com" <zeo...@gmail.com>
Subject Re: [DISCUSS] Opinionated Data Flows
Date Thu, 06 Oct 2016 17:09:44 GMT
One of those users gives this a +1.  This also appears related to METRON-477
<https://issues.apache.org/jira/browse/METRON-477>, except that 477 is more
focused on data flow once it hits disk and this is during ingest/stream
processing.  At the end of the day, not that different IMO.  Would love to
see it all managed via Stellar/ZooKeeper.

Jon

On Thu, Oct 6, 2016 at 1:00 PM Nick Allen <nick@nickallen.org> wrote:

In reality, the current "engine" is Storm + Kafka + HBase.  Each of these
could be independently swapped out once Metron is just a DSL with multiple
underlying engines.
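
A minimal sketch of that idea, assuming a flow is nothing more than an
ordered list of per-message steps. Every name here (Flow, Engine,
LocalEngine, BatchEngine, parse, tag_internal) is hypothetical and is not
part of Metron or Stellar; the two toy engines only stand in for real
backends such as Storm or Flink.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Protocol

Message = Dict[str, object]
Step = Callable[[Message], Message]


@dataclass
class Flow:
    """Engine-agnostic description of a data flow (hypothetical)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def then(self, step: Step) -> "Flow":
        # Append a step and return self so calls can be chained.
        self.steps.append(step)
        return self


class Engine(Protocol):
    """Anything that can execute a Flow over a set of messages."""
    def run(self, flow: Flow, messages: Iterable[Message]) -> List[Message]: ...


class LocalEngine:
    """Pushes each message through every step before taking the next one."""
    def run(self, flow: Flow, messages: Iterable[Message]) -> List[Message]:
        out = []
        for msg in messages:
            msg = dict(msg)  # copy so the caller's input is left untouched
            for step in flow.steps:
                msg = step(msg)
            out.append(msg)
        return out


class BatchEngine:
    """Applies one step to the whole batch before moving to the next step."""
    def run(self, flow: Flow, messages: Iterable[Message]) -> List[Message]:
        batch = [dict(m) for m in messages]
        for step in flow.steps:
            batch = [step(m) for m in batch]
        return batch


# Two illustrative "useful bits of functionality" (made up for this sketch).
def parse(msg: Message) -> Message:
    src, _, dst = str(msg["raw"]).partition("->")
    return {**msg, "ip_src": src.strip(), "ip_dst": dst.strip()}


def tag_internal(msg: Message) -> Message:
    return {**msg, "internal": str(msg["ip_src"]).startswith("10.")}


# The user defines the flow once; the engine is chosen separately.
flow = Flow("demo").then(parse).then(tag_internal)
events = [{"raw": "10.0.0.1 -> 8.8.8.8"}]
results = LocalEngine().run(flow, events)
```

Because the flow definition never mentions an engine, swapping
LocalEngine for BatchEngine changes how the steps are scheduled but not
what the user wrote.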

Ok, I'll stop.

On Thu, Oct 6, 2016 at 12:58 PM, Nick Allen <nick@nickallen.org> wrote:

> Chasing this bad idea down even further leads me to something even
> crazier.
>
> Stellar 1.0 can only operate within a single topology and in most cases
> only on a single message.  Stellar 2.0 could be the mechanism that allows
> users to define their own data flows and what "useful bits of Metron
> functionality" get plugged in.
>
> Once you have a DSL that allows users to define what they want Metron to
> do, then the underlying implementation mechanism (which is currently Storm)
> can also be swapped out.  If we have an even faster Storm implementation,
> then we swap in the Storm NG engine.  Maybe we want Metron to also run in
> Flink, then we just swap in a Flink engine.
>
>
>
>
> On Thu, Oct 6, 2016 at 12:52 PM, Nick Allen <nick@nickallen.org> wrote:
>
>> I totally "bird dogged the previous thread" as Casey likes to call it. :)
>>  I am extracting this thought into a separate thread before I start
>> throwing out even more, crazier ideas.
>>
>>> In general, Metron is very opinionated about data flows right now.  We
>>> have Parser topologies that feed an Enrichment topology, which then feeds
>>> an Indexing topology.  We have useful bits of functionality (think Stellar
>>> transforms, Geo enrichment, etc.) that are closely coupled with these
>>> topologies (aka data flows).
>>>
>>> When a user wants to parse heterogeneous data from a single topic, that's
>>> not easy.  When a user wants enriched output to land in unique topics by
>>> sensor type, well, that's also not easy.  When a user wanted to skip
>>> enrichment of data sources, we actually re-architected the data flow to add
>>> the Indexing topology.
>>>
>>> In an ideal world, a user should be responsible for defining the data
>>> flow, not Metron.  Metron should provide the "useful bits of functionality"
>>> that a user can "plug in" wherever they like.  Metron itself should not care
>>> how the data is moving or what step in the process it is at.
>>
>>
>>
>>
>> --
>> Nick Allen <nick@nickallen.org>
>>
>
>
>
> --
> Nick Allen <nick@nickallen.org>
>



--
Nick Allen <nick@nickallen.org>

-- 

Jon
