metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: Question about the customization of Metron with my machine learning algo.
Date Tue, 06 Jun 2017 17:43:47 GMT
So, first off, it's not a basic question at all and thanks for asking it.
I'm sure if it's not clear to you, then it's not clear to many and bears
some reinforcement and clarification.


   - Metron does indeed enable the deployment and use of machine learning
   models on data ingested into Metron
   - Metron runs atop Hadoop (storm + kafka + hdfs + hbase), so you likely
   wouldn't run this successfully on a VM, but rather a cluster.  We do
   support running Metron for demonstration purposes and development purposes
   inside a VM, but that's not a production configuration, I'd like to make
   clear.

Models deployed via MaaS can be called from Stellar on data ingested into
Metron, with a couple of caveats.  There are two ways to ingest data into
Metron:

   - Via a packet capture sensor (fastcapa) to Kafka to the pcap storm
   topology, which writes directly to HDFS with no preamble or enrichment
   - Via another, lower velocity sensor (e.g. bro for deep packet
   inspection or yaf for flow data) which is routed to a parser topology, then
   to enrichment and finally to indexing

We do not, at present, support interacting with models (or, indeed, any
enrichment) on raw packet data (the first case above).  We do, however,
support it in the second case.  The example at
https://github.com/apache/metron/tree/master/metron-analytics/metron-maas-service#example
demonstrates ingesting web proxy data and using a dummy machine learning
model to pick out domains which are synthetic and likely to represent
communication to a botnet (the DGA model in that example is crude and could
easily be replaced with the example I posed earlier, btw).
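
For orientation, the way that example calls the model from Stellar can be
sketched roughly as follows.  This is Python that only builds and prints the
JSON fragment for a parser's field transformation.  The Stellar expressions
and field names are reproduced from memory of the linked README, so treat
them as assumptions and check them against the README before use.

```python
import json

# Stellar expressions keyed by the field each one produces.
# ASSUMPTION: function and field names are recalled from the linked README,
# not verified against a running Metron; 'dga' is the model name used there.
stellar_fields = {
    "full_hostname": "URL_TO_HOST(url)",
    "domain_without_subdomains": "DOMAIN_REMOVE_SUBDOMAINS(full_hostname)",
    "is_malicious": (
        "MAP_GET('is_malicious', "
        "MAAS_MODEL_APPLY(MAAS_GET_ENDPOINT('dga'), "
        "{'host' : domain_without_subdomains}))"
    ),
    "is_alert": "if is_malicious == 'malicious' then 'true' else null",
}

# One STELLAR field transformation; outputs are listed in evaluation order.
field_transformation = {
    "transformation": "STELLAR",
    "output": list(stellar_fields),
    "config": stellar_fields,
}

# Fragment to merge into the parser configuration for the sensor.
print(json.dumps({"fieldTransformations": [field_transformation]}, indent=2))
```

The point to take away is the shape: one expression looks up the model's
endpoint by name, passes a map of arguments to it, and pulls a field out of
the map the model returns.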

Anyway, so for you to use your own ML model, you'd do the following:

   1. Ingest your sensor data into a Kafka topic
   2. Create a parser, or reuse one of the existing parsers that we
   support, to convert the data from your data source
   3. Create your model (see
   https://gist.github.com/cestella/8dd83031b8898a732b6a5a60fce1b616 as
   an example)
   4. Refer to your model from Stellar
      1. In the example I mentioned, we're doing that at
      https://github.com/apache/metron/tree/master/metron-analytics/metron-maas-service#adjust-configurations-for-squid-to-call-model
      2. You might consider doing it in the enrichment topology, but to get
      you started, doing it as a field transformation as in the example should
      suffice
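
To make step 3 a bit more concrete, here is a minimal sketch of a model
exposed as a small REST service, using only the Python standard library.
Everything in it is an assumption for illustration: the /apply path, the
"host" parameter, the "is_malicious" response field and the entropy
heuristic are all hypothetical, and score_domain() is just a stand-in for
where a trained model (a scikit-learn SVM, say) would make its prediction.
Registering the service with MaaS is not shown; follow the README for that.

```python
import json
import math
from collections import Counter
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def score_domain(host):
    """Hypothetical stand-in for a trained model's predict step.

    Labels long, high-entropy leading labels 'malicious', else 'legit'.
    """
    name = host.split(".")[0].lower()
    if not name:
        return "legit"
    counts = Counter(name)
    # Shannon entropy (bits per character) of the leading label.
    entropy = -sum(
        (n / len(name)) * math.log2(n / len(name)) for n in counts.values()
    )
    return "malicious" if len(name) >= 12 and entropy > 3.5 else "legit"


class ModelHandler(BaseHTTPRequestHandler):
    """Answers GET /apply?host=<domain> with a small JSON map."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/apply":
            self.send_error(404)
            return
        host = parse_qs(url.query).get("host", [""])[0]
        body = json.dumps({"is_malicious": score_domain(host)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def make_server(port=0):
    """Bind to localhost; port 0 lets the OS pick any free port."""
    return HTTPServer(("127.0.0.1", port), ModelHandler)
```

Calling make_server(1500).serve_forever() would start the service on port
1500, and a GET to /apply?host=example.com would return a JSON map with the
label.  Keeping the model behind a plain HTTP call is what lets it be written
in whatever language you are most comfortable with.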

Hopefully that'll clear some things up.  I'm about to give a talk about
this next week at DataWorks Summit, so I'll be sure to follow up here with
the deck.  There's also a blog post that will eventually be going out that
walks through this more directly.

If I missed something or if something isn't clear yet, I'll be sure to keep
at it. :)

Best,

Casey

On Mon, Jun 5, 2017 at 1:21 PM, <smlabs@libero.it> wrote:

> Hello Casey,
>
> your answer makes some things clearer, but not everything.
>
> My question about ML models was because somewhere on the web I read that
> Metron comes with ML.
> But maybe it's better to say that it supports ML models.
>
> If I understood correctly, I can run Metron in a virtual machine connected
> to my network. With NiFi I can select the protocols/packets that I want to
> store (similar to what Wireshark does).
>
> Then, I do not understand how to feed the data into the ML algorithm.
>
> Can you explain a bit more, or point me to a tutorial that explains the
> implementation process?
>
> For example, suppose I have an SVM algorithm that I would like to test in
> Metron, and that ML algorithm has been developed in Python using scikit-py.
>
> How can I do that?
>
> Thank you and I'm sorry for the very basic question.
>
> Best Regards,
>
> Simone
>
> On 5 June 2017 at 18:45, Casey Stella <cestella@gmail.com> wrote:
>
> We do not ship any ML models currently with metron, just the infrastructure
> to deploy your own models and interact with those models from within
> Metron. That being said, you might be interested in
> https://gist.github.com/cestella/8dd83031b8898a732b6a5a60fce1b616 That's
> the code to take a DGA model written in scikit-learn from
> https://github.com/ClickSecurity/data_hacking/tree/master/dga_detection
> and make it suitable for deployment via MaaS.
>
> If you want more information about MaaS, I'll be giving a talk on it next
> week at DataWorks Summit and that deck will be public.
>
> On Mon, Jun 5, 2017 at 12:09 PM, <smlabs@libero.it> wrote:
>
> Hello Simon,
>
> thank you for your prompt reply and for the link as well.
>
> I'm more comfortable with Python.
>
> May I ask if there is any example in Python that I can use as a template to
> receive network packets and then implement the machine learning algorithm?
>
> Moreover, where can I find documentation about the ML algorithms already
> implemented in Metron?
>
> Best Regards,
>
> Simone
>
> On 5 June 2017 at 18:00, Simon Elliston Ball
> <simon@simonellistonball.com> wrote:
>
> Hi Simone, and welcome to the community.
>
> There are a number of extension points in Metron, the key ones being
> around machine learning. I suggest taking a look at
> https://github.com/apache/metron/tree/master/metron-analytics/metron-maas-service
> for more information about Model as a Service (MaaS). This is the bit that
> helps you add models in pretty much any language that will run in a YARN
> container (Python, R and Spark models are probably the most popular).
>
> Hope that helps, and looking forward to hearing more about your
> research, and any contributions you feel like adding to the community.
>
> Simon
>
> On 5 Jun 2017, at 16:54, smlabs@libero.it wrote:
>
> Dear community,
>
> my name is Simone and I'm a researcher in the field of
> cybersecurity.
>
> I've just read about Apache Metron and I would ask:
>
>    - does it use machine learning or artificial intelligence?
>    - can I extend the machine learning algorithms already present in
>    Metron with my own?
>    - which language do I have to use to extend Metron with my
>    algorithms?
>
> Thank you.
>
> Best Regards,
>
> Simone
