metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: [Discuss] Improve Alerting
Date Wed, 01 Feb 2017 19:21:12 GMT
Like I said, here is a proposed solution to one of the gaps I identified in
the previous email.

*Problem*

There is little transparency into the Threat Triage process itself.  When
Threat Triage runs, all I get is a score.  I don't know how that score was
arrived at, which rules were triggered, or which specific values caused
a rule to trigger.

More specifically, there is no way to generate a message that looks like
"The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
threshold of '202'".  This makes it difficult for an analyst to action the
alert.

*Proposed Solution*

To improve the transparency of the Threat Triage process, I am proposing
these enhancements.

1. Threat Triage should attach to each message all of the rules that fired
in addition to the total calculated threat triage score.

2. Threat Triage should allow a custom message to be generated for each
rule.  The custom message would allow for some form of string interpolation
so that I can add specific values from each message to the generated
alert.  We could allow this in one or both of the new fields that Casey
just added, name and comment.
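
As a rough illustration of the interpolation described in item 2, here is a
minimal Python sketch.  It assumes a ${field} placeholder syntax (the syntax
is still open) and uses Python's standard string.Template; the helper name
render_comment and the sample message are hypothetical and are not part of
Metron.

```python
from string import Template


def render_comment(template: str, message: dict) -> str:
    """Fill ${field} placeholders in a rule comment from a telemetry message.

    Hypothetical helper, not part of Metron. Placeholders that have no
    matching field in the message are left untouched instead of raising.
    """
    return Template(template).safe_substitute(message)


# Sample telemetry message (made-up values for illustration).
message = {"hostname": "10.0.0.1", "value": 101, "value_threshold": 42}

comment = render_comment(
    "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
    message,
)
print(comment)
```

Leaving unknown placeholders in place is one possible choice; it keeps a
typo in a rule's comment visible in the alert rather than failing the whole
triage step.  Note that string.Template only recognizes placeholder names
made of ASCII letters, digits, and underscores, so dotted field names would
need a different mechanism.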


*Example*

1. In this example, we have a telemetry message with a field called 'value'
that we need to monitor.  In Enrichment, I calculate some sort of value
threshold, over which an alert should be generated.


2. In Threat Triage, I use the calculated value threshold to alert on any
message that has a value exceeding this threshold.

3. I can embed values from the message, like the hostname, value, and value
threshold, into the alert produced by Threat Triage.  Notice that I am
using ${this} for string interpolation, but it could be any syntax that we
choose.


"triageConfig" : {
  "riskLevelRules" : [
    {
      "name" : "Abnormal Value",
      "comment" : "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
      "rule" : "value > value_threshold",
      "score" : 10
    }
  ],
  "aggregator" : "MAX"
}


4. The Threat Triage process today would add only the total calculated
score.

"threat.triage.level": 10.0


With this proposal, Threat Triage would add the following to the message.

Notice how each of the ${variables} has been replaced with the actual
value extracted from the message.  This gives the analyst more context
to action the alert.

"threat.triage": {
    "score": 10.0,
    "rules": [
      {
        "name": "Abnormal Value",
        "comment" : "For 10.0.0.1; the value 101 exceeds threshold of 42",
        "score" : 10
      }
    ]
}
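
To make the proposed output concrete, here is a minimal Python sketch of a
triage step that records every rule that fired alongside the aggregated
score.  It is only an illustration under stated assumptions: each rule's
condition is modeled as a Python callable (in Metron it would be a Stellar
expression), the function name triage is hypothetical, the aggregators
other than MAX are illustrative, and a score of 0.0 when no rule fires is
my assumption.

```python
from string import Template

# Aggregators keyed by the name used in the "aggregator" config field.
# Only MAX appears in the example config; SUM is included for illustration.
AGGREGATORS = {"MAX": max, "SUM": sum}


def triage(message, rules, aggregator="MAX"):
    """Build the proposed "threat.triage" structure for one message.

    Hypothetical sketch, not Metron code. Each rule's "rule" entry is a
    Python callable taking the message and returning True when it fires.
    """
    fired = []
    for rule in rules:
        if rule["rule"](message):
            fired.append({
                "name": rule["name"],
                # Fill ${field} placeholders from the message itself.
                "comment": Template(rule["comment"]).safe_substitute(message),
                "score": rule["score"],
            })
    scores = [r["score"] for r in fired]
    # Assumption: no fired rules means a score of 0.0.
    total = float(AGGREGATORS[aggregator](scores)) if scores else 0.0
    return {"score": total, "rules": fired}


rules = [{
    "name": "Abnormal Value",
    "comment": "For ${hostname}; the value ${value} exceeds threshold of ${value_threshold}",
    "rule": lambda m: m["value"] > m["value_threshold"],
    "score": 10,
}]
message = {"hostname": "10.0.0.1", "value": 101, "value_threshold": 42}

result = triage(message, rules)
print(result)
```

Keeping the per-rule entries next to the total means the aggregated score
can always be traced back to the rules that produced it.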



What do you think?  Any alternative ideas?



On Wed, Feb 1, 2017 at 2:11 PM, Nick Allen <nick@nickallen.org> wrote:

> I'd like to explore the functionality that we have in Metron using a
> motivating example.  I think this will help highlight some gaps where we
> can enhance Metron.
>
> The motivating example is that I would like to create an alert if the
> number of inbound flows to any host over a 15 minute interval is abnormal.
> I would like the alert to contain the specific information below to
> streamline the triage process.
>
> Rule: Abnormal number of inbound flows
> Bin: 15 mins
> Alert: The host 'powned.svr.bank.com' has '230' inbound flows, exceeding
> the threshold of '202'
>
>
> *What Works*
>
> In some ways, this example is similar to the "Outlier Detection" demo that
> I performed with the Profiler a few months back.  We have most of what we
> need to do this, with a couple of caveats.
>
> 1. An enrichment would be added to enrich the message with the correct
> internal hostname 'powned.svr.bank.com'.
>
> 2. With the Profiler, I can capture some idea of what "normal" is for the
> number of inbound flows across 15 minute intervals.
>
> 3. With Threat Triage, I can create rules that alert when a value exceeds
> what the Profiler defines as normal.
>
>
> *What's Missing*
>
> It's nice to know that we are almost all the way there with this example.
> Unfortunately, there are two gaps that fall out of this.
>
> 1. *Threat Triage Transparency*
>
> There is little transparency into the Threat Triage process itself.  When
> Threat Triage runs, all I get is a score.  I don't know how that score was
> arrived at, which rules were triggered, and the specific values that caused
> a rule to trigger.
>
> More specifically, there is no way to generate a message that looks like
> "The host 'powned.svr.bank.com' has '230' inbound flows, exceeding the
> threshold of '202'".
>
>
> 2. *Triage Calculated Values from the Profiler*
>
> Also, the value being interrogated here, the number of inbound flows, is
> not a static value contained within any single telemetry message.  This
> value is calculated across multiple messages by the Profiler.  The current
> Threat Triage process cannot be used to interrogate values calculated by
> the Profiler.
>
>
> To try and keep this email concise and digestible, I am going to send a
> follow-on discussing proposed solutions for each of these separately.
>
>
>
>
>
>


-- 
Nick Allen <nick@nickallen.org>
