metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: [Discuss] Improve Alerting
Date Wed, 01 Feb 2017 19:43:28 GMT
Agreed.

What do you think about using the existing Indexing topology to write the
data to HBase for the profiler?

   - The Profiler would have only one output: Kafka.  The Profiler would
   not write to HBase.


   - Since the Profiler is just another source of telemetry, it is parsed,
   enriched, triaged, and then indexed.


   - Thanks to your recent work we can now configure each 'indexer'
   separately, so we would just have an HBase indexer.

Seems like a logical extension of this idea.  Probably a little
overreaching as a first pass.  Maybe that is something we can evolve
towards.
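
To make the idea a bit more concrete, here is a rough sketch in Python.  It
is purely illustrative and is not Metron code.  The class name, the
'ip_dst_addr', 'host' and 'inbound_flows' fields, and the use of
'source.type' as the guard key are all assumptions on my part; only the
'system_alert' source type and the is_alert flag come from this thread.
The profile counts inbound flows per host, emits one message per host on
the tick, and skips anything it produced itself so that its own output
cannot loop back into the count.  Writing the returned messages to Kafka is
left to the caller.

```python
from collections import Counter

# Source type for Profiler-generated messages. The value 'system_alert' is
# taken from the thread; using it as a loop guard is an assumption.
PROFILER_SOURCE_TYPE = "system_alert"


class FlowCountProfile:
    """Hypothetical profile: counts inbound flows per destination host."""

    def __init__(self):
        self.counts = Counter()

    def observe(self, message):
        # Loop guard: never profile a message the Profiler itself emitted.
        if message.get("source.type") == PROFILER_SOURCE_TYPE:
            return
        # 'ip_dst_addr' is an assumed field name for the destination host.
        host = message.get("ip_dst_addr")
        if host is not None:
            self.counts[host] += 1

    def tick(self):
        # Build one message per host, then reset the counts for the next
        # window. The caller would send these to the Kafka topic.
        out = [
            {
                "source.type": PROFILER_SOURCE_TYPE,
                "is_alert": True,
                "host": host,
                "inbound_flows": count,
            }
            for host, count in sorted(self.counts.items())
        ]
        self.counts.clear()
        return out
```

Because the emitted messages carry their own source type, they could be
enriched, triaged and indexed like any other telemetry, and the guard in
observe() is one simple way to avoid the feedback loop Casey mentions below.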



On Wed, Feb 1, 2017 at 2:37 PM, Casey Stella <cestella@gmail.com> wrote:

> Yeah, I think your solution and mine are the same based on reading your
> suggestion.  Just add a write section to the profile and you can write
> right back into the kafka queue and get all the triage goodness.  You would
> need to ensure that you don't end up with infinite loops back in the
> profiler.  So, things like profiles that interact with EVERY message and
> send a message back to the kafka queue in enrichment would be bad.
>
> On Wed, Feb 1, 2017 at 2:35 PM, Nick Allen <nick@nickallen.org> wrote:
>
> > Great.  I think we're thinking along the same lines.  I just sent a
> > follow-up of another proposal that takes this idea a little further.
> > What if we treated the Profiler as another source of telemetry?
> >
> > On Wed, Feb 1, 2017 at 2:23 PM, Casey Stella <cestella@gmail.com> wrote:
> >
> > > Regarding point 2, could we enable the profiler to write data to kafka
> > > and the enrichment queue?
> > >
> > > I'm proposing the profiler do something like this:
> > >
> > >    - Count the number of inbound flows
> > >    - On the tick, send a message to the enrichment queue containing:
> > >       - the number of flows
> > >       - A source type of 'system_alert'
> > >       - is_alert set to true
> > >    - In enrichment, we enrich and triage system_alert source data in
> > >    the same way we do any other.
> > >
> > > This would not solve the transparency issue, but it would at least keep
> > > triage in one place in the architecture.  Also, enabling Kafka writing
> > > would enable other types of use-cases, like situations where we find
> > > outliers *directly* in the profile and send the alerts directly to the
> > > indexing queue without triage.
> > >
> > > The only changes this proposal would require would be
> > >
> > >    1. a "write" section in a profile that takes a list of Stellar
> > >    statements and runs them when the profile is written on the tick
> > >    2. fixing the Stellar functions that write to Kafka
> > >
> > > Casey
> > >
> > > On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <nick@nickallen.org> wrote:
> > >
> > > > I'd like to explore the functionality that we have in Metron using a
> > > > > motivating example.  I think this will help highlight some gaps
> > > > > where we can enhance Metron.
> > > >
> > > > The motivating example is that I would like to create an alert if the
> > > > > number of inbound flows to any host over a 15 minute interval is
> > > > > abnormal.
> > > > I would like the alert to contain the specific information below to
> > > > streamline the triage process.
> > > >
> > > > Rule: Abnormal number of inbound flows
> > > > Bin: 15 mins
> > > > > Alert: The host 'powned.svr.bank.com' has '230' inbound flows,
> > > > > exceeding the threshold of '202'
> > > >
> > > >
> > > > *What Works*
> > > >
> > > > > In some ways, this example is similar to the "Outlier Detection" demo
> > > > > that I performed with the Profiler a few months back.  We have most of
> > > > > what we need to do this with a couple caveats.
> > > >
> > > > > 1. An enrichment would be added to enrich the message with the
> > > > > correct internal hostname 'powned.svr.bank.com'.
> > > >
> > > > > 2. With the Profiler, I can capture some idea of what "normal" is for
> > > > > the number of inbound flows across 15 minute intervals.
> > > > > 3. With Threat Triage, I can create rules that alert when a value
> > > > > exceeds what the Profiler defines as normal.
> > > >
> > > >
> > > > *What's Missing*
> > > >
> > > > > It's nice to know that we are almost all the way there with this
> > > > > example.  Unfortunately, there are two gaps that fall out of this.
> > > >
> > > >  1. *Threat Triage Transparency*
> > > >
> > > > > There is little transparency into the Threat Triage process itself.
> > > > > When Threat Triage runs, all I get is a score.  I don't know how that
> > > > > score was arrived at, which rules were triggered, and the specific
> > > > > values that caused a rule to trigger.
> > > >
> > > > > More specifically, there is no way to generate a message that looks
> > > > > like "The host 'powned.svr.bank.com' has '230' inbound flows,
> > > > > exceeding the threshold of '202'".
> > > >
> > > >
> > > > 2. *Triage Calculated Values from the Profiler*
> > > >
> > > > > Also, the value being interrogated here, the number of inbound flows,
> > > > > is not a static value contained within any single telemetry message.
> > > > > This value is calculated across multiple messages by the Profiler.
> > > > > The current Threat Triage process cannot be used to interrogate values
> > > > > calculated by the Profiler.
> > > >
> > > >
> > > > > To try and keep this email concise and digestible, I am going to send
> > > > > a follow-on discussing proposed solutions for each of these separately.
> > > >
> > >
> >
> >
> >
> > --
> > Nick Allen <nick@nickallen.org>
> >
>



-- 
Nick Allen <nick@nickallen.org>
