metron-dev mailing list archives

From Casey Stella <ceste...@gmail.com>
Subject Re: Profiler statistics NaN
Date Wed, 09 Aug 2017 20:38:40 GMT
Yeah, I'm leaning toward STATS_ADD or STATS_INIT taking a list of numbers.
STATS_MERGE seems confusing.
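
To make the proposal concrete, here is a rough Python model of what "STATS_ADD takes a list of numbers" could mean. This is only a sketch and not Metron's implementation: `stats_add` is a hypothetical stand-in, and a plain list of samples stands in for the real statistics object.

```python
from statistics import mean


def stats_add(stats, *values):
    """Hypothetical model of the proposed STATS_ADD behavior.

    `stats` is a list of samples (or None to start fresh). Any list or
    tuple among `values` is flattened, so every element is consumed
    rather than only the first one.
    """
    samples = list(stats) if stats is not None else []
    for value in values:
        if isinstance(value, (list, tuple)):
            samples.extend(value)  # proposed: consume the whole list
        elif value is not None:
            samples.append(value)
    return samples


# Both call styles now produce the same statistics.
print(mean(stats_add(None, 1, 2, 3)))    # varargs form
print(mean(stats_add(None, [1, 2, 3])))  # list form
print(len(stats_add(None, [1, 2, 3])))   # number of samples consumed
```

With that behavior, the varargs form and the list form agree, which is what a user would most likely expect.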

On Wed, Aug 9, 2017 at 4:37 PM, Nick Allen <nick@nickallen.org> wrote:

> Or even change the behavior of STATS_MERGE, too?  If STATS_MERGE gets raw
> numbers, it wraps those in a Stats object, then returns it.  Then Dima's
> example would just work as-is.
>
> I'm not sure I like that though.  Maybe so flexible as to be confusing?
> Thought I would throw it out as an alternative to consider.
>
>
>
>
> On Wed, Aug 9, 2017 at 4:31 PM Nick Allen <nick@nickallen.org> wrote:
>
> > Oh yeah, duh.  Now I'm with you. That would be a good quick hit.
> >
> > The current behavior is a little nutty.  If there is a list, it only
> > consumes the first element in the list.  I'd expect that it should either
> > do what you describe or complain that it doesn't know how to handle a
> > list.  Easy fix though.
> >
> > [Stellar]>>> STATS_MEAN(STATS_ADD(null, 1, 2, 3))
> > 2.0
> >
> > [Stellar]>>> STATS_MEAN(STATS_ADD(null, [1,2,3]))
> > 1.0
> >
> > [Stellar]>>> STATS_COUNT(STATS_ADD(null, [1,2,3]))
> > 1.0
> >
> > On Wed, Aug 9, 2017 at 4:17 PM Casey Stella <cestella@gmail.com> wrote:
> >
> >> outcoming is still a HLLP object, not a statistics object, so doing a
> >> STATS_MERGE on a bunch of them wouldn't work either.
> >>
> >> On Wed, Aug 9, 2017 at 4:15 PM, Nick Allen <nick@nickallen.org> wrote:
> >>
> >> > That is another problem.  Isn't the simplest answer to just change
> >> > this...
> >> >
> >> > "result": "HLLP_CARDINALITY(outcoming)"
> >> >
> >> > to this...
> >> >
> >> > "result": "outcoming"
> >> >
> >> > ?
> >> >
> >> > On Wed, Aug 9, 2017 at 3:48 PM Casey Stella <cestella@gmail.com> wrote:
> >> >
> >> > > Ok, so the problem here is that your profile is returning integers
> >> > > (specifically HLLP cardinalities) rather than stats objects.  When
> >> > > you're doing:
> >> > >     STATS_PERCENTILE(STATS_MERGE( PROFILE_GET('host-talks-to',
> >> > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS'))), 90)
> >> > > You are calling STATS_MERGE on a list of integers and it takes a list
> >> > > of statistics objects.
> >> > >
> >> > > What you can do instead is:
> >> > >     STATS_PERCENTILE( REDUCE( PROFILE_GET('host-talks-to',
> >> > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS')), (s, x) -> STATS_ADD(s, x),
> >> > > STATS_INIT()), 90)
> >> > >
> >> > > Ok, that looks horrible, doesn't it?  Well, thankfully we added
> >> > > temporary variables for stellar enrichments in 0.4.1.  Let's take that
> >> > > "numeric" stellar enrichment group and reimagine it.  With temporary
> >> > > variables, you would turn:
> >> > >
> >> > > "numeric" : {
> >> > >              "value_red_level_out": "STATS_PERCENTILE( REDUCE( PROFILE_GET('host-being-talked-to', ip_src_addr, PROFILE_FIXED(1, 'HOURS')), (s, x) -> STATS_ADD(s, x), STATS_INIT()), 95)",
> >> > >              "value_red_level_in": "STATS_PERCENTILE( REDUCE( PROFILE_GET('host-talks-to', ip_src_addr, PROFILE_FIXED(1, 'HOURS')), (s, x) -> STATS_ADD(s, x), STATS_INIT()), 95)"
> >> > >            },
> >> > >
> >> > > into:
> >> > > "numeric" : [
> >> > >              "profile_duration := PROFILE_FIXED(1, 'HOURS')",
> >> > >              "host_being_talked_to := PROFILE_GET('host-being-talked-to', ip_src_addr, profile_duration)",
> >> > >              "host_talks_to := PROFILE_GET('host-talks-to', ip_src_addr, profile_duration)",
> >> > >              "host_being_talked_to_stats := REDUCE(host_being_talked_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
> >> > >              "host_talks_to_stats := REDUCE(host_talks_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
> >> > >              "value_red_level_out := STATS_PERCENTILE(host_being_talked_to_stats, 95)",
> >> > >              "value_red_level_in := STATS_PERCENTILE(host_talks_to_stats, 95)",
> >> > >              "profile_duration := null",
> >> > >              "host_being_talked_to := null",
> >> > >              "host_talks_to := null",
> >> > >              "host_being_talked_to_stats := null",
> >> > >              "host_talks_to_stats := null"
> >> > >            ],
> >> > >
> >> > > That's a lot more to type, but it allows you to reuse and take the
> >> > > pieces in chunks.
> >> > >
> >> > > Ok, so now I find myself thinking "a pox on both your houses" since
> >> > > both examples now kinda look long and convoluted.  So, why are they?
> >> > > Well, that REDUCE is likely the culprit.  It's supposed to get us out
> >> > > of bad situations, not show up in what could be argued is the 80%
> >> > > case.  How about, instead, we allow STATS_ADD or STATS_INIT to take a
> >> > > list of numbers?  If so, we could pretty easily make that nicer:
> >> > >     STATS_PERCENTILE( STATS_ADD( PROFILE_GET('host-being-talked-to',
> >> > > ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)
> >> > >
> >> > > or
> >> > >     STATS_PERCENTILE( STATS_INIT( PROFILE_GET('host-being-talked-to',
> >> > > ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)
> >> > >
> >> > >
> >> > > We should make some sort of candy like that so we can avoid some of
> >> > > the complexity in the normal case.
> >> > >
> >> > > On Wed, Aug 9, 2017 at 3:03 PM, Dima Kovalyov <Dima.Kovalyov@sstech.us> wrote:
> >> > >
> >> > > > Hello Metron Team,
> >> > > >
> >> > > > I have created the following profiler:
> >> > > > > {
> >> > > > >   "profile": "host-talks-to",
> >> > > > >   "onlyif": "exists(source_ip)",
> >> > > > >   "foreach": "source_ip",
> >> > > > >   "init": {
> >> > > > >     "outcoming": "HLLP_INIT(5, 6)"
> >> > > > >   },
> >> > > > >   "update": { "outcoming": "HLLP_ADD(outcoming, destination_ip)" },
> >> > > > >   "result": "HLLP_CARDINALITY(outcoming)"
> >> > > > > }
> >> > > > I have also created an enrichment rule:
> >> > > > > {
> >> > > > >   "enrichment" : {
> >> > > > >     "fieldMap": {
> >> > > > >       "stellar" : {
> >> > > > >         "config" : {
> >> > > > >           "numeric" : {
> >> > > > >             "value_red_level_out": "STATS_PERCENTILE( STATS_MERGE( PROFILE_GET('host-being-talked-to', ip_src_addr, 1, 'HOURS')), 95)",
> >> > > > >             "value_red_level_in": "STATS_PERCENTILE( STATS_MERGE( PROFILE_GET('host-talks-to', ip_src_addr, 1, 'HOURS')), 95)"
> >> > > > >           },
> >> > > > >           "text" : {
> >> > > > >             "is_alert": "true"
> >> > > > >           }
> >> > > > >         }
> >> > > > >       }
> >> > > > >     }
> >> > > > >   }
> >> > > > > }
> >> > > > However, when I stream data to it I receive: "value_red_level_out": null,
> >> > > >
> >> > > > I checked in the profiler client and here is what I got:
> >> > > > > [Stellar]>>> PROFILE_GET( "host-talks-to" , "99.191.183.156",
> >> > > > > PROFILE_FIXED(300, "MINUTES"))
> >> > > > > [1, 6, 6, 6, 6, 6, 3, 4, 5, 6, 4, 6, 6, 6, 1, 1, 6, 6, 1, 4, 1, 1, 4,
> >> > > > > 6, 6, 1, 6, 6, 1, 2, 6, 1, 1, 1, 6, 4, 6, 6, 3, 1, 6, 2, 1, 6, 1, 6]
> >> > > > > [Stellar]>>> STATS_PERCENTILE(STATS_MERGE(
> >> > > > > PROFILE_GET('host-talks-to', '99.191.183.156', PROFILE_FIXED(10,
> >> > > > > 'HOURS'))), 90)
> >> > > > > NaN
> >> > > > > [Stellar]>>> STATS_MERGE( PROFILE_GET('host-talks-to',
> >> > > > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS')))
> >> > > > So the STATS_MERGE produces no results. Is this expected, or did I
> >> > > > make a mistake somewhere? Please advise.
> >> > > >
> >> > > >
> >> > > > p.s. I am following this use case:
> >> > > > https://github.com/hortonworks-gallery/metron-rules/tree/master/use-cases/DegreeOfHost
> >> > > > There were a number of errors in the configs originally, which I have
> >> > > > corrected; maybe I missed something else.
> >> > > >
> >> > > > - Dima
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
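
For anyone reading the thread later, the REDUCE workaround quoted above is an ordinary left fold: start from an empty statistics object and add one cardinality at a time. The Python below is a minimal sketch of that idea, not Metron code. `stats_init`, `stats_add` and `stats_percentile` are hypothetical stand-ins, a plain list replaces the statistics object, and a simple nearest-rank percentile is swapped in for whatever estimator STATS_PERCENTILE actually uses, so values may differ from Stellar's.

```python
import math
from functools import reduce


def stats_init():
    """Stand-in for STATS_INIT(): an empty collection of samples."""
    return []


def stats_add(samples, x):
    """Stand-in for STATS_ADD(s, x): return the samples plus one value."""
    return samples + [x]


def stats_percentile(samples, p):
    """Nearest-rank percentile; NaN when there are no samples."""
    if not samples:
        return float("nan")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# First ten cardinalities from the PROFILE_GET output in Dima's message.
cardinalities = [1, 6, 6, 6, 6, 6, 3, 4, 5, 6]

# Equivalent of REDUCE(list, (s, x) -> STATS_ADD(s, x), STATS_INIT()).
stats = reduce(lambda s, x: stats_add(s, x), cardinalities, stats_init())
p90 = stats_percentile(stats, 90)
print(p90)
```

In this model an empty statistics object yields NaN, which is one plausible reading of the NaN in the original report: nothing usable was ever added to the statistics.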
