metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: Profiler statistics NaN
Date Wed, 09 Aug 2017 19:48:37 GMT
Ok, so the problem here is that your profile is returning integers
(specifically HLLP cardinalities) rather than stats objects.  When you're
doing:
    STATS_PERCENTILE(STATS_MERGE( PROFILE_GET('host-talks-to',
'99.191.183.156', PROFILE_FIXED(10, 'HOURS'))), 90)
You are calling STATS_MERGE on a list of integers, but it takes a list of
statistics objects.

What you can do instead is:
    STATS_PERCENTILE( REDUCE( PROFILE_GET('host-talks-to',
'99.191.183.156', PROFILE_FIXED(10, 'HOURS')), (s, x) -> STATS_ADD(s, x),
STATS_INIT()), 90)
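
For intuition, the REDUCE above is a left fold: it starts from an empty stats object and adds each integer returned by the profile to it, and only then asks for a percentile.  Here is a rough Python analogy.  The Stats class and its nearest-rank percentile are hypothetical stand-ins invented for illustration, not Metron code and not how Metron computes percentiles; the sample values are the first ten cardinalities from the REPL output quoted below.

```python
import math
from functools import reduce


class Stats:
    """Toy stand-in for a Stellar statistics object (hypothetical)."""

    def __init__(self):
        self.values = []

    def add(self, x):
        # Plays the role of STATS_ADD(s, x): fold one number into the accumulator.
        self.values.append(x)
        return self

    def percentile(self, p):
        # Nearest-rank percentile, chosen only to keep the sketch short.
        if not self.values:
            return float("nan")
        ordered = sorted(self.values)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]


# Integers as returned by PROFILE_GET for an HLLP_CARDINALITY profile.
cardinalities = [1, 6, 6, 6, 6, 6, 3, 4, 5, 6]

# Analogue of REDUCE(list, (s, x) -> STATS_ADD(s, x), STATS_INIT()).
stats = reduce(lambda s, x: s.add(x), cardinalities, Stats())

print(stats.percentile(90))
```

The point is only the shape of the computation: the integers have to be folded into a stats object before STATS_PERCENTILE has anything to work with.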

Ok, that looks horrible, doesn't it?  Well, thankfully we added temporary
variables for stellar enrichments in 0.4.1.  Let's take that "numeric"
stellar enrichment group and reimagine it.  With temporary variables, you
would turn:

"numeric" : {
             "value_red_level_out": "STATS_PERCENTILE( REDUCE(
 PROFILE_GET('host-being-talked-to', ip_src_addr, PROFILE_FIXED(1,
'HOURS')), (s, x) -> STATS_ADD(s, x), STATS_INIT()), 95)",
             "value_red_level_in": "STATS_PERCENTILE( REDUCE(
PROFILE_GET('host-talks-to',
ip_src_addr, PROFILE_FIXED(1, 'HOURS')), (s, x) -> STATS_ADD(s, x),
STATS_INIT()), 95)"
           },

into:
"numeric" : [
             "profile_duration := PROFILE_FIXED(1, 'HOURS')",
             "host_being_talked_to := PROFILE_GET('host-being-talked-to', ip_src_addr, profile_duration)",
             "host_talks_to := PROFILE_GET('host-talks-to', ip_src_addr, profile_duration)",
             "host_being_talked_to_stats := REDUCE(host_being_talked_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
             "host_talks_to_stats := REDUCE(host_talks_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
             "value_red_level_out := STATS_PERCENTILE(host_being_talked_to_stats, 95)",
             "value_red_level_in := STATS_PERCENTILE(host_talks_to_stats, 95)",
             "profile_duration := null",
             "host_being_talked_to := null",
             "host_talks_to := null",
             "host_being_talked_to_stats := null",
             "host_talks_to_stats := null"
           ],

That's a lot more to type, but it lets you reuse intermediate results and
read the expression in smaller pieces.

Ok, so now I find myself thinking "a pox on both your houses" since both
examples now kinda look long and convoluted.  So, why are they?  Well, that
REDUCE is likely the culprit.  It's supposed to get us out of bad
situations, not show up in what could be argued is the 80% case.  How about,
instead, we allow STATS_ADD or STATS_INIT to take a list of numbers?  If
so, we could pretty easily make that nicer:
    STATS_PERCENTILE( STATS_ADD(  PROFILE_GET('host-being-talked-to',
ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)

or
     STATS_PERCENTILE( STATS_INIT(  PROFILE_GET('host-being-talked-to',
ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)


We should make some sort of candy like that so we can avoid some of the
complexity in the normal case.
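
To make the proposal concrete, here is a small Python sketch of what list-accepting versions might look like.  Everything in it is hypothetical: stats_init, stats_add and stats_percentile are illustrative stand-ins (a plain list and a nearest-rank percentile), not existing Stellar behavior.  It only shows that seeding from a list, or adding a whole list at once, should give the same result as the REDUCE form.

```python
import math
from functools import reduce


def stats_init(values=()):
    # Hypothetical STATS_INIT that may be seeded with a list of numbers.
    return list(values)


def stats_add(stats, x):
    # Hypothetical STATS_ADD that accepts one number or a list of numbers.
    extra = list(x) if isinstance(x, (list, tuple)) else [x]
    return stats + extra


def stats_percentile(stats, p):
    # Nearest-rank percentile, used here only to keep the sketch short.
    if not stats:
        return float("nan")
    ordered = sorted(stats)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


cardinalities = [1, 6, 6, 6, 6, 6, 3, 4, 5, 6]

folded = reduce(stats_add, cardinalities, stats_init())  # today's REDUCE form
seeded = stats_init(cardinalities)                       # proposed STATS_INIT(list)
added = stats_add(stats_init(), cardinalities)           # proposed STATS_ADD(list)

print(stats_percentile(folded, 95) == stats_percentile(seeded, 95))
```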

On Wed, Aug 9, 2017 at 3:03 PM, Dima Kovalyov <Dima.Kovalyov@sstech.us>
wrote:

> Hello Metron Team,
>
> I have created following profiler:
> > {
> >   "profile": "host-talks-to",
> >   "onlyif": "exists(source_ip)",
> >   "foreach": "source_ip",
> >   "init": {
> >     "outcoming": "HLLP_INIT(5, 6)"
> >           },
> >   "update": { "outcoming": "HLLP_ADD(outcoming, destination_ip)" },
> >   "result": "HLLP_CARDINALITY(outcoming)"
> > }
> I have also created enrichment rule:
> > {
> >   "enrichment" : {
> >     "fieldMap": {
> >       "stellar" : {
> >         "config" : {
> >           "numeric" : {
> >             "value_red_level_out": "STATS_PERCENTILE( STATS_MERGE(
> > PROFILE_GET('host-being-talked-to', ip_src_addr, 1, 'HOURS')), 95)",
> >             "value_red_level_in": "STATS_PERCENTILE( STATS_MERGE(
> > PROFILE_GET('host-talks-to', ip_src_addr, 1, 'HOURS')), 95)"
> >           },
> >           "text" : {
> >             "is_alert": "true"
> >           }
> >         }
> >       }
> >     }
> >   } }
> However when I stream data to it I receive: "value_red_level_out": null,
>
> I have checked in profiler client and here is what I got:
> > [Stellar]>>> PROFILE_GET( "host-talks-to" , "99.191.183.156",
> > PROFILE_FIXED(300, "MINUTES"))
> > [1, 6, 6, 6, 6, 6, 3, 4, 5, 6, 4, 6, 6, 6, 1, 1, 6, 6, 1, 4, 1, 1, 4,
> > 6, 6, 1, 6, 6, 1, 2, 6, 1, 1, 1, 6, 4, 6, 6, 3, 1, 6, 2, 1, 6, 1, 6]
> > [Stellar]>>> STATS_PERCENTILE(STATS_MERGE(
> > PROFILE_GET('host-talks-to', '99.191.183.156', PROFILE_FIXED(10,
> > 'HOURS'))), 90)
> > NaN
> > [Stellar]>>> STATS_MERGE( PROFILE_GET('host-talks-to',
> > '99.191.183.156', PROFILE_FIXED(10, 'HOURS')))
> So STATS_MERGE produces no results. Is this expected, or did I make a
> mistake somewhere? Please advise.
>
>
> p.s. I am following this use case:
> https://github.com/hortonworks-gallery/metron-rules/tree/master/use-cases/DegreeOfHost
> There were a number of errors in the configs originally, which I have
> corrected; maybe I missed something else.
>
> - Dima