metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: [Discuss] Improve Alerting
Date Thu, 02 Feb 2017 15:15:17 GMT
I created 3 separate JIRAs to track the "Threat Triage Transparency"
portion of the work falling out of this discussion thread.  The first would
create a mechanism to do string interpolation.  The second would enhance
threat triage to use the string interpolation.  The third would enhance the
output of threat triage.

[1] Create String Formatting Function for Stellar
https://issues.apache.org/jira/browse/METRON-687

[2] Allow Threat Triage Comment Field to Contain Stellar Expressions
https://issues.apache.org/jira/browse/METRON-688

[3] Record of Rule Set that Fired During Threat Triage
https://issues.apache.org/jira/browse/METRON-686
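
To show how the three pieces could fit together, here is a rough Python
sketch. It is not Metron code and not how the JIRAs will necessarily be
implemented. The names interpolate and triage are hypothetical, a Python
lambda stands in for a Stellar rule, the ${field} syntax is just the
placeholder syntax floated earlier in the thread, and a score of 0.0 when no
rule fires is an assumption made for the example.

```python
import re

def interpolate(template, message):
    """Replace each ${field} in the template with that field's value from the message."""
    def lookup(match):
        field = match.group(1)
        # Assumption: leave the placeholder untouched if the field is absent.
        return str(message[field]) if field in message else match.group(0)
    return re.sub(r"\$\{(\w+)\}", lookup, template)

def triage(message, rules, aggregator=max):
    """Evaluate every rule and return the score plus a record of the rules that fired."""
    fired = []
    for rule in rules:
        # A Python predicate stands in for the Stellar rule expression.
        if rule["rule"](message):
            fired.append({
                "name": rule["name"],
                "comment": interpolate(rule["comment"], message),
                "score": rule["score"],
            })
    # Assumption: a message that triggers no rules gets a score of 0.0.
    score = float(aggregator(r["score"] for r in fired)) if fired else 0.0
    return {"score": score, "rules": fired}

rules = [{
    "name": "Abnormal Value",
    "comment": "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
    "rule": lambda m: m["value"] > m["value_threshold"],
    "score": 10,
}]

message = {"hostname": "10.0.0.1", "value": 101, "value_threshold": 42}
result = triage(message, rules)
print(result)
```

With the example message from the thread, the single rule fires and its
comment carries the hostname, value, and threshold from the message, which is
the kind of record [3] is meant to capture.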

Please let me know if anyone's concerns were not captured.  I will create
additional JIRAs for the other portion of the effort (*Triage Calculated
Values from the Profiler*) once I've given everyone a little more time to
voice an opinion.

On Thu, Feb 2, 2017 at 9:46 AM, Nick Allen <nick@nickallen.org> wrote:

> Oh, I see.  Yes, very useful.
>
>
> On Thu, Feb 2, 2017 at 9:39 AM, Simon Elliston Ball <
> simon@simonellistonball.com> wrote:
>
>> That’s a part of it, certainly (and fixes another of my bugbears, so
>> thank you!)
>>
>> In addition to the aggregation being stellar, I want score to be a
>> stellar statement; I’ve put in a separate ticket for that.
>> https://issues.apache.org/jira/browse/METRON-685
>>
>> Simon
>>
>> > On 2 Feb 2017, at 14:31, Nick Allen <nick@nickallen.org> wrote:
>> >
>> >> I would much rather be able to say something like score = some stellar
>> >> statement that returns a float...
>> >
>> >
>> > Completely agree.  FYI - We added METRON-683 yesterday that I believe
>> > supports what you are saying.  Feel free to add commentary.
>> >
>> > https://issues.apache.org/jira/browse/METRON-683
>> >
>> > On Thu, Feb 2, 2017 at 9:02 AM, Simon Elliston Ball <
>> > simon@simonellistonball.com> wrote:
>> >
>> >> I completely agree with Nick’s transparency comments, and like the
>> >> design of the configuration, especially provision for messaging around
>> >> the nature of the rule fired.
>> >>
>> >> I would just like to add a small point on the capabilities here. If the
>> >> message could have embedded values through some sort of template for a
>> >> stellar statement, it would make for a better, more dynamic alert reason.
>> >>
>> >> I would also like to see the score field capable of outputting the value
>> >> of a stellar statement. At the moment the idea of a static score being
>> >> passed on means that if I have a probabilistic result I want to combine
>> >> with other triage sources, I have to do a lot of bucketing into fixed
>> >> values. I would much rather be able to say something like score = some
>> >> stellar statement that returns a float, 'alertness' = threshold of this.
>> >> That way I can combine multiple triage rules to trigger an overall alert,
>> >> making the aggregators more meaningful.
>> >>
>> >> Simon
>> >>
>> >>
>> >>> On 2 Feb 2017, at 12:40, Carolyn Duby <cduby@hortonworks.com> wrote:
>> >>>
>> >>> For profiler alerts it will be helpful during analysis to see the alerts
>> >>> that caused the anomaly.  The meta alert is useful for incidents involving
>> >>> correlation of multiple events.
>> >>>
>> >>> Also, you will need to filter out known hosts that trigger anomalies,
>> >>> for example vulnerability scanning software.
>> >>>
>> >>> One final thing to consider is that anomalies happen every day without a
>> >>> security incident.  Depending on the network, the profiler alerts could get
>> >>> very noisy, so it might be better to correlate profiler alerts with other
>> >>> alerts.
>> >>>
>> >>> Thanks
>> >>> Carolyn
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> -------- Original message --------
>> >>> From: Casey Stella <cestella@gmail.com>
>> >>> Date: 2/1/17 2:28 PM (GMT-05:00)
>> >>> To: dev@metron.incubator.apache.org
>> >>> Subject: Re: [Discuss] Improve Alerting
>> >>>
>> >>> I like the direction.  One thing that we may want is for comment to just
>> >>> be a stellar expression and construct a function to essentially do
>> >>> String.format().  So, that'd become:
>> >>> "triageConfig" : {
>> >>> "riskLevelRules" : [
>> >>>   {
>> >>>     "name" : "Abnormal Value",
>> >>>     "comment" : "FORMAT('For %s; the value %s exceeds threshold of %d', hostname, value, value_threshold)",
>> >>>     "rule" : "value > value_threshold",
>> >>>     "score" : 10
>> >>>   }
>> >>> ],
>> >>> "aggregator" : "MAX"
>> >>> }
>> >>>
>> >>> The reason:
>> >>>
>> >>>  - It's integrated and stellar is our default scripting layer
>> >>>  - It supports doing some computation in the message
>> >>>
>> >>>
>> >>> On Wed, Feb 1, 2017 at 2:21 PM, Nick Allen <nick@nickallen.org> wrote:
>> >>>
>> >>>> Like I said, here is a proposed solution to one of the gaps I identified
>> >>>> in the previous email.
>> >>>>
>> >>>> *Problem*
>> >>>>
>> >>>> There is little transparency into the Threat Triage process itself.  When
>> >>>> Threat Triage runs, all I get is a score.  I don't know how that score was
>> >>>> arrived at, which rules were triggered, and the specific values that caused
>> >>>> a rule to trigger.
>> >>>>
>> >>>> More specifically, there is no way to generate a message that looks like
>> >>>> "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
>> >>>> threshold of '202'".  This makes it difficult for an analyst to action the
>> >>>> alert.
>> >>>>
>> >>>> *Proposed Solution*
>> >>>>
>> >>>> To improve the transparency of the Threat Triage process, I am proposing
>> >>>> these enhancements.
>> >>>>
>> >>>> 1. Threat Triage should attach to each message all of the rules that fired
>> >>>> in addition to the total calculated threat triage score.
>> >>>>
>> >>>> 2. Threat Triage should allow a custom message to be generated for each
>> >>>> rule.  The custom message would allow for some form of string interpolation
>> >>>> so that I can add specific values from each message to the generated
>> >>>> alert.  We could allow this in one or both of the new fields that Casey
>> >>>> just added, name and comment.
>> >>>>
>> >>>>
>> >>>> *Example*
>> >>>>
>> >>>> 1. In this example, we have a telemetry message with a field called 'value'
>> >>>> that we need to monitor.  In Enrichment, I calculate some sort of value
>> >>>> threshold, over which an alert should be generated.
>> >>>>
>> >>>>
>> >>>> 2. In Threat Triage, I use the calculated value threshold to alert on any
>> >>>> message that has a value exceeding this threshold.
>> >>>>
>> >>>> 3. I can embed values from the message, like the hostname, value, and value
>> >>>> threshold, into the alert produced by Threat Triage.  Notice that I am
>> >>>> using ${this} for string interpolation, but it could be any syntax that we
>> >>>> choose.
>> >>>>
>> >>>>
>> >>>> "triageConfig" : {
>> >>>> "riskLevelRules" : [
>> >>>>   {
>> >>>>     "name" : "Abnormal Value",
>> >>>>     "comment" : "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
>> >>>>     "rule" : "value > value_threshold",
>> >>>>     "score" : 10
>> >>>>   }
>> >>>> ],
>> >>>> "aggregator" : "MAX"
>> >>>> }
>> >>>>
>> >>>>
>> >>>> 4. The Threat Triage process today would add only the total calculated
>> >>>> score.
>> >>>>
>> >>>> "threat.triage.level": 10.0
>> >>>>
>> >>>>
>> >>>> With this proposal, Threat Triage would add the following to the message.
>> >>>>
>> >>>> Notice how each of the ${variables} has been replaced with the actual
>> >>>> values extracted from the message.  This allows for more contextual
>> >>>> information to action the alert.
>> >>>>
>> >>>> "threat.triage": {
>> >>>>   "score": 10.0,
>> >>>>   "rules": [
>> >>>>     {
>> >>>>       "name": "Abnormal Value",
>> >>>>       "comment" : "For 10.0.0.1; the value 101 exceeds threshold of 42",
>> >>>>       "score" : 10
>> >>>>     }
>> >>>>   ]
>> >>>> }
>> >>>>
>> >>>>
>> >>>>
>> >>>> What do you think?  Any alternative ideas?
>> >>>>
>> >>>>
>> >>>>
>> >>>> On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <nick@nickallen.org> wrote:
>> >>>>
>> >>>>> I'd like to explore the functionality that we have in Metron using a
>> >>>>> motivating example.  I think this will help highlight some gaps where we
>> >>>>> can enhance Metron.
>> >>>>>
>> >>>>> The motivating example is that I would like to create an alert if the
>> >>>>> number of inbound flows to any host over a 15 minute interval is abnormal.
>> >>>>> I would like the alert to contain the specific information below to
>> >>>>> streamline the triage process.
>> >>>>>
>> >>>>> Rule: Abnormal number of inbound flows
>> >>>>> Bin: 15 mins
>> >>>>> Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding
>> >>>>> the threshold of '202'
>> >>>>>
>> >>>>>
>> >>>>> *What Works*
>> >>>>>
>> >>>>> In some ways, this example is similar to the "Outlier Detection" demo that
>> >>>>> I performed with the Profiler a few months back.  We have most of what we
>> >>>>> need to do this, with a couple of caveats.
>> >>>>>
>> >>>>> 1. An enrichment would be added to enrich the message with the correct
>> >>>>> internal hostname 'powned.svr.bank.com'.
>> >>>>>
>> >>>>> 2. With the Profiler, I can capture some idea of what "normal" is for the
>> >>>>> number of inbound flows across 15 minute intervals.
>> >>>>>
>> >>>>> 3. With Threat Triage, I can create rules that alert when a value exceeds
>> >>>>> what the Profiler defines as normal.
>> >>>>>
>> >>>>>
>> >>>>> *What's Missing*
>> >>>>>
>> >>>>> It's nice to know that we are almost all the way there with this example.
>> >>>>> Unfortunately, there are two gaps that fall out of this.
>> >>>>>
>> >>>>> 1. *Threat Triage Transparency*
>> >>>>>
>> >>>>> There is little transparency into the Threat Triage process itself.  When
>> >>>>> Threat Triage runs, all I get is a score.  I don't know how that score was
>> >>>>> arrived at, which rules were triggered, and the specific values that caused
>> >>>>> a rule to trigger.
>> >>>>>
>> >>>>> More specifically, there is no way to generate a message that looks like
>> >>>>> "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
>> >>>>> threshold of '202'".
>> >>>>>
>> >>>>>
>> >>>>> 2. *Triage Calculated Values from the Profiler*
>> >>>>>
>> >>>>> Also, the value being interrogated here, the number of inbound flows, is
>> >>>>> not a static value contained within any single telemetry message.  This
>> >>>>> value is calculated across multiple messages by the Profiler.  The current
>> >>>>> Threat Triage process cannot be used to interrogate values calculated by
>> >>>>> the Profiler.
>> >>>>>
>> >>>>>
>> >>>>> To try and keep this email concise and digestible, I am going to send a
>> >>>>> follow-on discussing proposed solutions for each of these separately.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Nick Allen <nick@nickallen.org>
>> >>>>
>> >>
>> >>
>>
>>
>
