metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: Name conventions for parsers
Date Thu, 06 Oct 2016 16:48:41 GMT
In general, Metron is very opinionated about data flows right now.  We have
Parser topologies that feed an Enrichment topology, which then feeds an
Indexing topology.  We have useful bits of functionality (think Stellar
transforms, Geo enrichment, etc) that are closely coupled with these
topologies (aka data flows).

When a user wants to parse heterogeneous data from a single topic, that's
not easy.  When a user wants enriched output to land in unique topics by
sensor type, well, that's also not easy.  When a user wanted to skip
enrichment of data sources, we actually re-architected the data flow to add
the Indexing topology.
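
To make the first two cases concrete, here is a minimal sketch of splitting
one mixed stream by sensor type and picking an output topic per type.  It is
plain Python, not Metron code: the parsers, the split rules, and the
"enriched_<sensor>" topic names are all hypothetical.

```python
import re
from typing import Callable, Dict, Tuple

Message = Dict[str, str]
Parser = Callable[[str], Message]


def parse_sudo(line: str) -> Message:
    # Hypothetical parser: keeps the raw line and tags the sensor type.
    return {"source.type": "sudo", "original_string": line}


def parse_apache(line: str) -> Message:
    # Hypothetical parser: keeps the raw line and tags the sensor type.
    return {"source.type": "apache", "original_string": line}


# Ordered split rules: (pattern matched against the raw line, parser to apply).
# The patterns are assumptions about what the syslog program tag looks like.
SPLIT_RULES = [
    (re.compile(r"\bsudo(\[\d+\])?:"), parse_sudo),
    (re.compile(r"\b(apache2?|httpd)(\[\d+\])?:"), parse_apache),
]


def route(line: str) -> Tuple[str, Message]:
    """Return (output topic, parsed message) for one raw line."""
    for pattern, parser in SPLIT_RULES:
        if pattern.search(line):
            message = parser(line)
            return "enriched_" + message["source.type"], message
    # Nothing matched: keep the raw line on a catch-all topic.
    return "unparsed", {"source.type": "unknown", "original_string": line}
```

Each line from the single topic goes through route(), and whatever comes
back says which parser handled it and where the result should land.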

In an ideal world, a user should be responsible for defining the data flow,
not Metron.  Metron should provide the "useful bits of functionality" that
a user can "plug in" wherever they like.  Metron itself should not care how
the data is moving or what step in the process it is at.
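
As a rough illustration of that idea, and only a sketch: the steps below are
ordinary functions from message to message, and the user decides which ones
to chain and in what order.  None of these names are Metron APIs; the Geo
lookup table and field names are made-up sample data.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Step = Callable[[Message], Message]


def geo_enrich(message: Message) -> Message:
    # Stand-in for a Geo enrichment; the lookup table is made-up sample data.
    countries = {"192.0.2.10": "US"}
    country = countries.get(message.get("ip_src_addr", ""), "unknown")
    return {**message, "geo.country": country}


def normalize_protocol(message: Message) -> Message:
    # Stand-in for a Stellar-style field transform.
    return {**message, "protocol": message.get("protocol", "").upper()}


def index(message: Message) -> Message:
    # Stand-in for indexing: only marks the message as indexed.
    return {**message, "indexed": "true"}


def run_flow(message: Message, steps: List[Step]) -> Message:
    """Apply the user's chosen steps in the user's chosen order."""
    for step in steps:
        message = step(message)
    return message


# Two flows defined by the user, not by the framework.
FULL_FLOW = [normalize_protocol, geo_enrich, index]
SKIP_ENRICHMENT_FLOW = [index]
```

Skipping enrichment is then just a shorter list of steps rather than a
change to the architecture.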

On Thu, Oct 6, 2016 at 12:27 PM, Nick Allen <nick@nickallen.org> wrote:

> I think that's a good problem to solve, Jon.  Different types of data
> hitting the same Kafka topic will be a very common problem, and we should
> make it easy to handle.  As Simon mentioned, solving it also helps with
> ingesting low-volume data streams, where the cost of a dedicated topology
> is overkill.
>
> Syslog is a good example use case.  Another example use case might be
> extracting data out of Splunk.  I worked at an organization that was using
> Splunk as the centralized log store to meet regulatory requirements.  Of
> course, Splunk is expensive so overlaying additional functionality on the
> existing installation was cost prohibitive.  The only efficient way we
> could get data out of Splunk was one big pipe containing heterogeneous data.
> Perhaps there are other ways around it now.  I am no Splunk expert, but
> this seems like a common problem.
>
>
>
> On Thu, Oct 6, 2016 at 11:51 AM, Zeolla@GMail.com <zeolla@gmail.com>
> wrote:
>
>> A storm splitter gateway topology was another path that I considered,
>> especially because it would allow configs like what Yohann mentioned
>> earlier with:
>>
>> > So, it would be really useful that Metron could handle a syslog flow
>> > and automatically apply the right parser for each log. In order to
>> > help Metron, a config could be provided by the "Security Platform
>> > Engineer" to preselect a list of parsers per device (as you know what
>> > type of logs a device should send).  This feature exists in
>> > commercial SIEM.
>>
>> It's just not as easy to get going as an upstream splitter and/or parser in
>> my scenario.
>>
>> Perhaps that should be an enhancement JIRA though?  I really think we need
>> to lower the barrier to getting logs to Metron in the first place, even
>> going as far as having a syslog listener (I looked at embedding rsyslog and
>> syslog-ng and they both unfortunately are GPL licensed, so that's out...).
>>
>> Jon
>>
>> On Thu, Oct 6, 2016 at 9:58 AM Otto Fowler <ottobackwards@gmail.com>
>> wrote:
>>
>> Each of these split things would need to end up in their own topology,
>> since they would each have different Stellar and Enrichment
>> configurations.
>>
>> It would be simpler, I think, to split them than to have a topology chain
>> that ‘switches’ over a type of field and muddies the Stellar configs, etc.
>>
>> If that is true, then the question is whether to split as part of the
>> external delivery (not Metron’s problem) in NiFi or XXXX, or to have a
>> ‘gateway-splitter’ topology with only split rules to feed the other typed
>> topologies.
>>
>> Or I’m totally wrong and you can forgive me ;)
>>
>> O
>>
>>
>> On October 6, 2016 at 08:32:51, Zeolla@GMail.com (zeolla@gmail.com)
>> wrote:
>>
>> If we don't do it by device I would be concerned that some more
>> appliance-based systems wouldn't allow the flexibility to split things up
>> to different destinations, nor would they allow external additions (NiFi,
>> etc.). This is where I am right now, where I can send from certain appliances
>> into my syslog infrastructure, then either force my syslog architecture to
>> selectively send onto Metron, or parse and then send into a generic JSON
>> parser (I will probably go the latter route). In order to standardize and
>> simplify, I would suggest continuing down the device-based route.
>>
>> Generally, I expect the community to grow and for parsers to just exist,
>> and some users to only do minor updates to them or throw together grok
>> parsers using GROK_PREDICT() where necessary. In fact I would hope that is
>> the case, as it would indicate a broader user base.
>>
>> Jon
>>
>> On Thu, Oct 6, 2016 at 8:02 AM Simon Elliston Ball <
>> simon@simonellistonball.com> wrote:
>>
>> > > On 6 Oct 2016, at 12:22, Yohann Lepage <yohann@lepage.info> wrote:
>> > >
>> > > 2016-10-06 12:21 GMT+02:00 Zeolla@GMail.com <zeolla@gmail.com>:
>> > >> I would think that instead we work to make each parser able to handle
>> > >> all the known outputs (and document explicitly what outputs per parser
>> > >> are supported) from a product and go back to vendor_product, with
>> > >> versions of the product supported/tested and version of the parser
>> > >> being stored in code and documentation only.
>> > > +1
>> > >
>> >
>> > +1 - this is similar to the evolving schema problem, and probably belongs
>> > in code.
>> >
>> > >> I'm currently working on mechanisms to get logs into Metron most
>> > >> efficiently because all of my syslog comes in one big pipe.
>> > > I have a similar use case. Most of the time, admins are ok to forward
>> > > logs from rsyslog/syslog-ng to the SIEM as they don't want to install
>> > > an agent ( *.* @@siem.intra:514;).
>> > >
>> > > The result is that you receive a mix of logs
>> > > (sudo/apache/mysql/audit/etc) from the same device and the SIEM has
>> > > to deal with it.
>> > >
>> > > So, it would be really useful that Metron could handle a syslog flow
>> > > and automatically apply the right parser for each log. In order to
>> > > help Metron, a config could be provided by the "Security Platform
>> > > Engineer" to preselect a list of parsers per device (as you know what
>> > > type of logs a device should send). This feature exists in
>> > > commercial SIEM.
>> > >
>> >
>> > +1 for this too. One question though: do you think it’s viable to do this
>> > by device? I would expect multiple types of syslog coming from the same
>> > physical device, especially when dealing with things like server logs.
>> >
>> > This could potentially be handled with minimal parse and routing in NiFi,
>> > but that may make setup more complex than the sort of mapping you’re
>> > talking about here. Thoughts?
>> >
>> > Simon
>>
>> --
>>
>> Jon
>>
>
>
>
> --
> Nick Allen <nick@nickallen.org>
>



-- 
Nick Allen <nick@nickallen.org>
