metron-dev mailing list archives

From Nick Allen <n...@nickallen.org>
Subject Re: Profiler statistics NaN
Date Wed, 09 Aug 2017 20:37:06 GMT
Or even change the behavior of STATS_MERGE, too?  If STATS_MERGE gets raw
numbers, it wraps those in a Stats object, then returns it.  Then Dima's
example would just work as-is.

I'm not sure I like that though.  Maybe so flexible as to be confusing?
Thought I would throw it out as an alternative to consider.
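
To make that alternative concrete, here is a rough Python model of a merge that accepts raw numbers alongside stats objects. It is not Metron's Java code: `Stats` and `stats_merge` are made-up names for illustration, and the summary simply keeps every value in a list.

```python
import math
from dataclasses import dataclass, field
from typing import Iterable, List, Union


@dataclass
class Stats:
    """Toy stand-in for a statistics object; not Metron's real class."""
    values: List[float] = field(default_factory=list)

    def add(self, *xs: float) -> "Stats":
        self.values.extend(float(x) for x in xs)
        return self

    def mean(self) -> float:
        # An empty summary has no mean, so report NaN.
        return sum(self.values) / len(self.values) if self.values else math.nan


def stats_merge(items: Iterable[Union[Stats, int, float]]) -> Stats:
    """Merge Stats objects; raw numbers are wrapped first (the proposed behavior)."""
    merged = Stats()
    for item in items:
        if isinstance(item, Stats):
            merged.add(*item.values)
        elif isinstance(item, (int, float)):
            # Hypothetical: treat a raw number as a one-element Stats.
            merged.add(item)
        else:
            raise TypeError(f"cannot merge {type(item).__name__}")
    return merged
```

With something like this, `stats_merge([1, 6, 6])` and `stats_merge([Stats().add(1, 6), 6])` would give the same summary, which is exactly the flexibility I am unsure about.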




On Wed, Aug 9, 2017 at 4:31 PM Nick Allen <nick@nickallen.org> wrote:

> Oh yeah, duh.  Now I'm with you. That would be a good quick hit.
>
> The current behavior is a little nutty.  If there is a list, it only
> consumes the first element in the list.  I'd expect that it should either
> do what you describe or complain that it doesn't know how to handle a
> list.  Easy fix though.
>
> [Stellar]>>> STATS_MEAN(STATS_ADD(null, 1, 2, 3))
> 2.0
>
> [Stellar]>>> STATS_MEAN(STATS_ADD(null, [1,2,3]))
> 1.0
>
> [Stellar]>>> STATS_COUNT(STATS_ADD(null, [1,2,3]))
> 1.0
>
> On Wed, Aug 9, 2017 at 4:17 PM Casey Stella <cestella@gmail.com> wrote:
>
>> outcoming is still an HLLP object, not a statistics object, so doing a
>> STATS_MERGE on a bunch of them wouldn't work either.
>>
>> On Wed, Aug 9, 2017 at 4:15 PM, Nick Allen <nick@nickallen.org> wrote:
>>
>> > That is another problem.  Isn't the simplest answer to just change this...
>> >
>> > "result": "HLLP_CARDINALITY(outcoming)"
>> >
>> > to this...
>> >
>> > "result": "outcoming"
>> >
>> > ?
>> >
>> > On Wed, Aug 9, 2017 at 3:48 PM Casey Stella <cestella@gmail.com> wrote:
>> >
>> > > Ok, so the problem here is that your profile is returning integers
>> > > (specifically HLLP cardinalities) rather than stats objects.  When you're
>> > > doing:
>> > >     STATS_PERCENTILE(STATS_MERGE(PROFILE_GET('host-talks-to',
>> > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS'))), 90)
>> > > You are calling STATS_MERGE on a list of integers and it takes a list of
>> > > statistics objects.
>> > >
>> > > What you can do instead is:
>> > >     STATS_PERCENTILE( REDUCE( PROFILE_GET('host-talks-to',
>> > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS')), (s, x) -> STATS_ADD(s, x),
>> > > STATS_INIT()), 90)
>> > >
>> > > Ok, that looks horrible, doesn't it?  Well, thankfully we added temporary
>> > > variables for Stellar enrichments in 0.4.1.  Let's take that "numeric"
>> > > Stellar enrichment group and reimagine it.  With temporary variables, you
>> > > would turn:
>> > >
>> > > "numeric" : {
>> > >              "value_red_level_out": "STATS_PERCENTILE( REDUCE( PROFILE_GET('host-being-talked-to', ip_src_addr, PROFILE_FIXED(1, 'HOURS')), (s, x) -> STATS_ADD(s, x), STATS_INIT()), 95)",
>> > >              "value_red_level_in": "STATS_PERCENTILE( REDUCE( PROFILE_GET('host-talks-to', ip_src_addr, PROFILE_FIXED(1, 'HOURS')), (s, x) -> STATS_ADD(s, x), STATS_INIT()), 95)"
>> > >            },
>> > >
>> > > into:
>> > > "numeric" : [
>> > >              "profile_duration := PROFILE_FIXED(1, 'HOURS')",
>> > >              "host_being_talked_to := PROFILE_GET('host-being-talked-to', ip_src_addr, profile_duration)",
>> > >              "host_talks_to := PROFILE_GET('host-talks-to', ip_src_addr, profile_duration)",
>> > >              "host_being_talked_to_stats := REDUCE(host_being_talked_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
>> > >              "host_talks_to_stats := REDUCE(host_talks_to, (s, x) -> STATS_ADD(s, x), STATS_INIT())",
>> > >              "value_red_level_out := STATS_PERCENTILE(host_being_talked_to_stats, 95)",
>> > >              "value_red_level_in := STATS_PERCENTILE(host_talks_to_stats, 95)",
>> > >              "profile_duration := null",
>> > >              "host_being_talked_to := null",
>> > >              "host_talks_to := null",
>> > >              "host_being_talked_to_stats := null",
>> > >              "host_talks_to_stats := null"
>> > >            ],
>> > >
>> > > That's a lot more to type, but it allows you to reuse and take the pieces
>> > > in chunks.
>> > >
>> > > Ok, so now I find myself thinking "a pox on both your houses" since both
>> > > examples now kinda look long and convoluted.  So, why are they?  Well, that
>> > > REDUCE is likely the culprit.  It's supposed to get us out of bad
>> > > situations, not show up in what could be argued is the 80% case.  How about,
>> > > instead, we allow STATS_ADD or STATS_INIT to take a list of numbers?  If
>> > > so, we could pretty easily make that nicer:
>> > >     STATS_PERCENTILE( STATS_ADD( PROFILE_GET('host-being-talked-to',
>> > > ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)
>> > >
>> > > or
>> > >     STATS_PERCENTILE( STATS_INIT( PROFILE_GET('host-being-talked-to',
>> > > ip_src_addr, PROFILE_FIXED(1, 'HOURS'))), 95)
>> > >
>> > >
>> > > We should make some sort of candy like that so we can avoid some of the
>> > > complexity in the normal case.
>> > >
>> > > On Wed, Aug 9, 2017 at 3:03 PM, Dima Kovalyov <Dima.Kovalyov@sstech.us>
>> > > wrote:
>> > >
>> > > > Hello Metron Team,
>> > > >
>> > > > I have created the following profiler:
>> > > > > {
>> > > > >   "profile": "host-talks-to",
>> > > > >   "onlyif": "exists(source_ip)",
>> > > > >   "foreach": "source_ip",
>> > > > >   "init": {
>> > > > >     "outcoming": "HLLP_INIT(5, 6)"
>> > > > >           },
>> > > > >   "update": { "outcoming": "HLLP_ADD(outcoming, destination_ip)" },
>> > > > >   "result": "HLLP_CARDINALITY(outcoming)"
>> > > > > }
>> > > > I have also created an enrichment rule:
>> > > > > {
>> > > > >   "enrichment" : {
>> > > > >     "fieldMap": {
>> > > > >       "stellar" : {
>> > > > >         "config" : {
>> > > > >           "numeric" : {
>> > > > >             "value_red_level_out": "STATS_PERCENTILE( STATS_MERGE( PROFILE_GET('host-being-talked-to', ip_src_addr, 1, 'HOURS')), 95)",
>> > > > >             "value_red_level_in": "STATS_PERCENTILE( STATS_MERGE( PROFILE_GET('host-talks-to', ip_src_addr, 1, 'HOURS')), 95)"
>> > > > >           },
>> > > > >           "text" : {
>> > > > >             "is_alert": "true"
>> > > > >           }
>> > > > >         }
>> > > > >       }
>> > > > >     }
>> > > > >   } }
>> > > > However, when I stream data to it I receive: "value_red_level_out": null,
>> > > >
>> > > > I have checked in the profiler client and here is what I got:
>> > > > > [Stellar]>>> PROFILE_GET( "host-talks-to" , "99.191.183.156",
>> > > > > PROFILE_FIXED(300, "MINUTES"))
>> > > > > [1, 6, 6, 6, 6, 6, 3, 4, 5, 6, 4, 6, 6, 6, 1, 1, 6, 6, 1, 4, 1, 1, 4,
>> > > > > 6, 6, 1, 6, 6, 1, 2, 6, 1, 1, 1, 6, 4, 6, 6, 3, 1, 6, 2, 1, 6, 1, 6]
>> > > > > [Stellar]>>> STATS_PERCENTILE(STATS_MERGE(
>> > > > > PROFILE_GET('host-talks-to', '99.191.183.156', PROFILE_FIXED(10,
>> > > > > 'HOURS'))), 90)
>> > > > > NaN
>> > > > > [Stellar]>>> STATS_MERGE( PROFILE_GET('host-talks-to',
>> > > > > '99.191.183.156', PROFILE_FIXED(10, 'HOURS')))
>> > > > So the STATS_MERGE produces no results. Is this expected, or did I
>> > > > make a mistake somewhere? Please advise.
>> > > >
>> > > >
>> > > > p.s. I am following this use case:
>> > > > https://github.com/hortonworks-gallery/metron-rules/tree/master/use-cases/DegreeOfHost
>> > > > There were a number of errors in the configs originally, which I have
>> > > > corrected; maybe I missed something else.
>> > > >
>> > > > - Dima
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > >
>> >
>>
>
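
For reference, the two approaches quoted above (folding with REDUCE and STATS_ADD versus letting STATS_ADD take a list) can be modeled in Python. This is only a sketch: `stats_add` and `percentile` are hypothetical stand-ins, the summary is a plain list, the sample cardinalities are made up, and the nearest-rank percentile is an assumption rather than whatever estimator Metron uses.

```python
import math
from functools import reduce
from typing import Iterable, List, Optional, Union

Number = Union[int, float]


def stats_add(state: Optional[List[float]],
              *args: Union[Number, Iterable[Number]]) -> List[float]:
    """Toy STATS_ADD: a None state starts a new summary; lists are flattened."""
    out = list(state) if state is not None else []
    for arg in args:
        if isinstance(arg, (int, float)):
            out.append(float(arg))
        else:
            # Proposed behavior: consume every element, not just the first.
            out.extend(float(x) for x in arg)
    return out


def percentile(values: List[float], p: float) -> float:
    """Nearest-rank percentile (an assumption); NaN when there is no data."""
    if not values:
        return math.nan
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


profile = [1, 6, 6, 6, 3, 4]  # made-up cardinalities, one per profile period

# REDUCE(profile, (s, x) -> STATS_ADD(s, x), STATS_INIT()) as a left fold.
folded = reduce(lambda s, x: stats_add(s, x), profile, stats_add(None))

# The proposed shorthand: hand the whole list to STATS_ADD in one call.
direct = stats_add(None, profile)
```

In this model `folded` and `direct` hold the same values, so the shorthand would not change the percentile; it only removes the REDUCE boilerplate from the common case.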
