spot-dev mailing list archives

From Alan Ross <a...@apache.org>
Subject Re: Spot Suspicious Connects Description and questions related to 'feedback' from UI to ML
Date Thu, 25 May 2017 17:50:48 GMT
On the scoring piece.  1 has traditionally been "Bad" and 3 has been
"Benign".  Are we changing that?

Alan

On Thu, May 25, 2017 at 10:49 AM, Alan Ross <alan@apache.org> wrote:

> I don't believe this list permits attachments, Brandon.  Perhaps post it to
> Google Docs and send out a link?
>
> Alan
>
> On Thu, May 25, 2017 at 10:27 AM, Edwards, Brandon <
> brandon.edwards@intel.com> wrote:
>
>> Hi all,
>>
>>
>>
>> I am attaching the document that describes how Spot uses LDA in order to
>> perform anomaly detection on network events. I have also received multiple
>> questions related to how the ‘user scoring’ (‘feedback’) of particular
>> items in the suspicious connects report (in the UI layer) is used in ML. We
>> have not provided much detail on this functionality in the attached
>> document. I thought I’d put an explanation out there and we can discuss
>> questions related to my explanation and discuss what additional info should
>> be included in the attached document.
>>
>>
>>
>> The Spot team feels that this ‘feedback’ functionality needs to change, and
>> sees these changes happening alongside improvements to how context from an
>> LDA model trained on one batch of data is carried forward to the next
>> training run (or even to training in a streaming use case). The value of
>> ‘feedback’ depends on the quality of the model context we can carry over.
>>
>>
>>
>> The idea for feedback is as follows. The items that are scored with a 1
>> (i.e. the user identifies the item as benign and so does not want to see it
>> in the suspicious connects report anymore) will be used for letting the
>> machine learning component know that such an entry should not be considered
>> as suspicious anymore. Currently this is done by injecting artificial log
>> entries into the next batch of data so that LDA sees many such entries and
>> therefore no longer sees them as anomalies.
>>
>>
>>
>> We have ideas for other ways to provide this functionality. For example, we
>> could filter entries matching the identified pattern out of the next batch
>> BEFORE ML runs on it. For items that the user scores in the UI as ‘3’ (for
>> example, the user sees an IP as so suspicious that they want to see all
>> future log entries associated with that IP), we could filter future items
>> matching such a pattern so that they skip ML and are instead reported in a
>> separate pane of the UI or inserted at the top of the most suspicious
>> events.
>>
>>
>>
>> Comments, Questions?
>>
>> Brandon
>>
>
>
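
For reference, the two feedback mechanisms described in the quoted message
(injecting duplicate entries so LDA stops treating them as anomalies, and
pre-filtering the batch before ML runs) could be sketched roughly as below.
This is only an illustration, not Spot's actual code: the Event and Feedback
records, the function names, the `copies` count, and the choice to watch the
source IP of a scored event are all assumptions. The score values are passed
in as parameters because which number means "benign" is exactly the open
question in this thread.

```python
from collections import namedtuple

# Hypothetical record types for illustration; Spot's real schemas differ.
Event = namedtuple("Event", ["src_ip", "dst_ip", "dst_port"])
Feedback = namedtuple("Feedback", ["event", "score"])


def inject_benign_feedback(batch, feedback, benign_score, copies=1000):
    """Current approach: append `copies` duplicates of each benign-scored
    event so the next training run sees the pattern as common."""
    augmented = list(batch)
    for item in feedback:
        if item.score == benign_score:
            augmented.extend([item.event] * copies)
    return augmented


def split_by_feedback(batch, feedback, benign_score, suspicious_score):
    """Proposed alternative: filter the batch BEFORE ML runs.

    Returns (flagged, for_ml): events touching a watched IP bypass ML and
    are reported directly; events matching benign feedback are dropped;
    everything else goes to ML as usual.
    """
    benign = {f.event for f in feedback if f.score == benign_score}
    # Assumption: the scored event's source IP is the one to watch.
    watched_ips = {f.event.src_ip for f in feedback
                   if f.score == suspicious_score}
    flagged, for_ml = [], []
    for event in batch:
        if event.src_ip in watched_ips or event.dst_ip in watched_ips:
            flagged.append(event)
        elif event not in benign:
            for_ml.append(event)
    return flagged, for_ml
```

With the scoring used in the quoted message, the calls would pass
benign_score=1 and suspicious_score=3; under the traditional convention
mentioned at the top of this reply, the two values would be swapped.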
