metron-dev mailing list archives

From "Zeolla@GMail.com" <zeo...@gmail.com>
Subject Re: [GitHub] incubator-metron pull request #503: METRON-815 sensor-stubs sometimes send m...
Date Mon, 03 Apr 2017 10:28:04 GMT
Bro timestamps are often out of order within a log, because some lines are
written when a connection ends and others are written when an event occurs
within a connection.  The timestamps can be confusing to look at initially,
but it is very normal for them not to be in order.  Also, we already break
any sort order by randomly selecting logs from bro.out and replacing their
timestamps with the current timestamp, so I'm not concerned about my changes
causing any more of a headache than flattening the decimal places with 0s
did.
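
For anyone following along, here is a minimal sketch of the two rewrite
strategies being compared, run against the three example timestamps from
Matt's comment.  The helper names, the sample records, and the use of
`sed -E` are my own assumptions for illustration; the actual stub builds
SEARCH/REPLACE variables and may invoke sed differently.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from start-bro-stub.

# Replace only the integer seconds of "ts" and keep the original fraction.
# $1 = replacement epoch seconds; JSON lines are read from stdin.
keep_fraction() {
  sed -E 's/"ts":[0-9]+\./"ts":'"$1"'./'
}

# Replace the seconds and zero out a fraction of any length.
# $1 = replacement epoch seconds; JSON lines are read from stdin.
zero_fraction() {
  sed -E 's/"ts":[0-9]+\.[0-9]+/"ts":'"$1"'.000000/'
}

# Made-up records carrying the timestamps from the review comment.
sample='{"ts":1491190032.222222,"uid":"a"}
{"ts":1491190032.777777,"uid":"b"}
{"ts":1491190033.111111,"uid":"c"}'

# Fixed value so the output is reproducible; the stub uses `date +%s`.
now=1491190442

echo "keep fraction:"
printf '%s\n' "$sample" | keep_fraction "$now"
echo "zero fraction:"
printf '%s\n' "$sample" | zero_fraction "$now"
```

With the fraction kept, the third record ends in .111111 after one ending
in .777777, which is the apparent reordering Matt describes; with the
fraction zeroed, all three records get the same timestamp.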

Jon

On Sun, Apr 2, 2017, 11:50 PM mattf-horton <git@git.apache.org> wrote:

> Github user mattf-horton commented on a diff in the pull request:
>
>
> https://github.com/apache/incubator-metron/pull/503#discussion_r109336484
>
>     --- Diff: metron-deployment/roles/sensor-stubs/templates/start-bro-stub ---
>     @@ -47,8 +47,8 @@ TOPIC="bro"
>      while true; do
>
>        # transform the bro timestamp and push to kafka
>     -  SEARCH="\"ts\"\:[0-9]\+.[0-9]\{6\}"
>     -  REPLACE="\"ts\"\:`date +%s`.000000"
>     +  SEARCH="\"ts\"\:[0-9]\+\."
>     +  REPLACE="\"ts\"\:`date +%s`\."
>     --- End diff --
>
>     @JonZeolla, good catch.  Leaving the fractional portion of the
> timestamp untouched is appealing.  However, since the granularity of
> `date +%s` is only seconds, and we might transform a bunch of timestamps
> within one second of wall-clock time, this may result in apparently
> out-of-order timestamps, no?  E.g., if we start with data whose first
> three records have timestamps:
>     1491190032.222222 1491190032.777777 1491190033.111111
>     the transformed data will have timestamps
>     1491190442.222222 1491190442.777777 1491190442.111111
>     with later ones being (at least potentially) out of order.  The
> original code would have generated
>     1491190442.000000 1491190442.000000 1491190442.000000
>     which is rather monotone, but at least not out of order.
>
>     Is this okay, or potentially bad?
>     Perhaps it would be better to just change the `.[0-9]\{6\}` to
> `\.[0-9]\+` in line 50, and leave line 51 unchanged?
>     (I'm asking, I don't know.  Maybe bro data can naturally be out of
> order?)
>
>
-- 

Jon
