metron-dev mailing list archives

From Raghu Mitra Kandikonda <r...@hortonworks.com>
Subject Re: [DISCUSS] Unique id for messages
Date Mon, 13 Mar 2017 14:27:46 GMT
Thanks for the direction :) will work on this and update the thread.

-Raghu




On 10/03/17, 10:47 PM, "Casey Stella" <cestella@gmail.com> wrote:

>Yes, we do use a UUID in the enrichment topology; this is our message join
>key on the join portion of the split/join enrichment.  The logic being used
>is at EnrichmentSplitterBolt.java, line 63.
>
>IMO we might bring that out, make it part of the message, and reuse that
>unique identifier in the enrichment topology.
>
>On Fri, Mar 10, 2017 at 10:51 AM, Zeolla@GMail.com <zeolla@gmail.com> wrote:
>
>> I definitely think that this is a valuable discussion.  I seem to recall
>> cstella mentioning at some point in the past that there is a UUID already
>> used in storm that we might be able to expose into the message itself, but
>> I could be wrong.
>>
>> For additional context regarding prior discussions, this was also briefly
>> discussed in another topic here
>> <https://lists.apache.org/thread.html/b039f0f0a5e6cfaf30944dc768088e1e1bd5dae4b2247dda12698805@%3Cdev.metron.apache.org%3E>.
>> In that context I was hoping to be able to link messages across all
>> indexing destinations (HDFS, ES, Solr, etc.).
>>
>> On Fri, Mar 10, 2017 at 9:26 AM Raghu Mitra Kandikonda <rksv@hortonworks.com>
>> wrote:
>>
>> > Hi All,
>> >
>> > I would like to start a discussion around adding a unique id to all the
>> > parsed messages.  I feel there was a discussion around a similar topic,
>> > but I am not sure that we as a community agreed on a proposal.
>> >
>> > We could
>> > - use a random number generator like UUID, but this might have performance
>> >   implications
>> > - use Kafka topic name + system time + Kafka message offset to generate a
>> >   unique identifier
>> > - use the input message to generate a hashcode
>> >
>> > Any thoughts?
>> >
>> > (Attached email that had similar discussion for error indexing)
>> >
>> > Regards,
>> > RaghuM
>> >
>> >
>> >
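For illustration, the three candidate schemes listed above could be sketched as follows. This is a minimal Python sketch rather than Metron code: the function names and the ':' separator are assumptions, and SHA-256 stands in for the unspecified "hashcode".

```python
import hashlib
import uuid


def random_id() -> str:
    """Option 1: a random (version 4) UUID, independent of the message."""
    return str(uuid.uuid4())


def kafka_position_id(topic: str, offset: int, system_time_ms: int) -> str:
    """Option 2: topic name + system time + Kafka message offset.

    The ':' separator and the field order are assumptions.
    """
    return f"{topic}:{system_time_ms}:{offset}"


def content_hash_id(raw_message: bytes) -> str:
    """Option 3: a digest of the raw input (SHA-256 is an assumption)."""
    return hashlib.sha256(raw_message).hexdigest()
```

Kafka offsets are only unique within a partition, so the second scheme would likely need the partition number as well. The third scheme gives identical inputs the same id, which may or may not be what is wanted.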
>> > ---------- Forwarded message ----------
>> > From: "Zeolla@GMail.com" <zeolla@gmail.com>
>> > To: "dev@metron.incubator.apache.org" <dev@metron.incubator.apache.org>
>> > Cc:
>> > Bcc:
>> > Date: Wed, 1 Feb 2017 22:18:12 +0000
>> > Subject: Re: [DISCUSS] Error Indexing
>> > Simply as a unique identifier of the original information which is failing
>> > some step, and thus giving you something to key in on and create a count of
>> > unique events and prioritize issues without the concern of cyclical issues
>> > (if the issue is with indexing a specific message, and you try to index it
>> > again, it will just fail in a loop).
>> >
>> > Jon
>> >
>> > On Wed, Feb 1, 2017 at 6:59 AM Dima Kovalyov <Dima.Kovalyov@sstech.us>
>> > wrote:
>> >
>> > > That's a great topic of discussion.
>> > >
>> > > Throughout the thread the idea of having a hash of the message that
>> > > failed has changed; can someone please explain why you plan to use this
>> > > hash, and how?
>> > >
>> > > - Dima
>> > >
>> > > On 02/01/2017 06:23 AM, Zeolla@GMail.com wrote:
>> > > > After thinking on this for a few days I recant my previous suggestion of
>> > > > TupleHash256.  It's still a bit early for SHA-3 - no good reference
>> > > > implementations/libraries exist (I did some searching and emailing), it
>> > > > is optimized for hardware but no hardware implementation is widely
>> > > > accessible, FIPS 140-3 is still not close to finalized, etc.
>> > > >
>> > > > I think we could simulate the benefits of tuplehash by sorting the
>> > > > tuples, then doing SHA-256(len(tuple1) | tuple1 | ... | len(tuplen) |
>> > > > tuplen).  Happy to entertain opposing thoughts, such as BLAKE2, etc. but
>> > > > with the likely users of Metron, I think sticking with FIPS 140-2 is a
>> > > > solid choice.
>> > > >
>> > > > Jon
>> > > >
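A minimal sketch of the sort-then-length-prefix construction described above. It assumes each tuple is a UTF-8 string and each length is written as an 8-byte big-endian integer; neither encoding is specified in the thread, and the function name is hypothetical.

```python
import hashlib
import struct


def tuple_hash_sha256(tuples):
    """SHA-256 over len(t1) | t1 | ... | len(tn) | tn, with tuples sorted."""
    digest = hashlib.sha256()
    # Sorting makes the result independent of the order the tuples arrive in.
    for item in sorted(t.encode("utf-8") for t in tuples):
        # The length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(struct.pack(">Q", len(item)))
        digest.update(item)
    return digest.hexdigest()
```

Without the length prefixes, two different sets of tuples that concatenate to the same bytes would hash identically, which is the ambiguity TupleHash is designed to avoid.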
>> > > > On Thu, Jan 26, 2017, 11:23 AM Zeolla@GMail.com <zeolla@gmail.com> wrote:
>> > > >
>> > > > So one more thing regarding why I think we should throw an exception on a
>> > > > failed enrichment.  If we do make something like username a constant
>> > > > field, in cases where that is used to calculate rawMessage_hash, if it
>> > > > fails to enrich, the hash would be different compared to when it
>> > > > succeeds.  Of course I think the initial intent of adding username as a
>> > > > constant field would be to handle it in the parsers, where that
>> > > > information is provided in the messages themselves, but how would Threat
>> > > > Intel know the difference?  In my environment I am looking forward to a
>> > > > streaming enrichment that adds the username, where applicable, anywhere I
>> > > > have an IP.
>> > > >
>> > > > My hesitant suggestion for a hashing algorithm would be to use
>> > > > TupleHash256, as it is a NIST-specified function derived from SHA-3
>> > > > (built on cSHAKE) for this use case.  Details here
>> > > > <http://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-185.pdf>.
>> > > > However, I haven't been able to find a reference implementation of this
>> > > > in any language, so that's a bit of a downside.  A more general SHA3-256
>> > > > implementation where we handle ordering could work as well, but would be
>> > > > significantly less optimal.
>> > > >
>> > > > Jon
>> > > >
>> > > > On Thu, Jan 26, 2017 at 10:20 AM Ryan Merriman <merrimanr@gmail.com>
>> > > > wrote:
>> > > >
>> > > > Jon, I misread the code in the GenericEnrichmentBolt.  The error is
>> > > > forwarded on so no issues there.
>> > > >
>> > > > Defaulting to the common fields makes sense.  I will dig into the
>> > > > GenericEnrichmentBolt more, maybe there is a way to get the error fields
>> > > > without having to significantly change things.  Any opinion on a hashing
>> > > > algorithm?
>> > > >
>> > > > On Wed, Jan 25, 2017 at 9:37 PM, Zeolla@GMail.com <zeolla@gmail.com>
>> > > > wrote:
>> > > >
>> > > >> Although hashing the whole message is better than nothing, it misses a
>> > > >> lot of the benefits we could get.
>> > > >>
>> > > >> While I'd love to have consistency for this field across all of the
>> > > >> different error.types, it appears that may not be reasonably possible
>> > > >> because of the parsers.  So, how about something like hash all of the
>> > > >> constant fields
>> > > >> <https://github.com/apache/incubator-metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/Constants.java>
>> > > >> excluding timestamp and original_string unless it is a parser, in which
>> > > >> case hash the entire message?  This gives us some measure of event
>> > > >> uniqueness and it can grow as we define additional constant fields (I
>> > > >> recall discussing with someone else on the list regarding expanding those
>> > > >> standard fields to include things like usernames but I can't find the
>> > > >> specific email exchange).
>> > > >>
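One way to read the proposal above, as a hedged Python sketch. The set of constant fields is passed in by the caller because the contents of Constants.java are not reproduced here; the helper names, the JSON canonicalisation, and the use of SHA-256 are all assumptions.

```python
import hashlib
import json

# Fields the proposal excludes from the hash for non-parser errors.
EXCLUDED_FIELDS = {"timestamp", "original_string"}


def fields_to_hash(message, constant_fields, is_parser_error):
    """Pick the part of the message that feeds the hash."""
    if is_parser_error:
        # Parser errors have no reliable fields yet: hash the entire message.
        return dict(message)
    return {
        name: value
        for name, value in message.items()
        if name in constant_fields and name not in EXCLUDED_FIELDS
    }


def event_hash(message, constant_fields, is_parser_error=False):
    """Hash the selected fields in a key-order-independent form."""
    selected = fields_to_hash(message, constant_fields, is_parser_error)
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this selection, two non-parser errors that differ only in timestamp or original_string share a hash, which is what lets repeats of the same event be counted together.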
>> > > >> Because some enrichments can be heavily relied on, I think it makes
>> > > >> sense to put a message onto the error queue when it throws an exception.
>> > > >> Not only does this help troubleshoot edge cases, but it makes issues more
>> > > >> obvious when assembling a new enrichment in dev/test.  I can't think of a
>> > > >> scenario currently where an enrichment would only be "best effort" and
>> > > >> that I wouldn't want that error indexed and retrievable.  However, this
>> > > >> gets interesting when talking about the various options to solve the
>> > > >> "Enrich enrichment" discussion from earlier in the month.  We can keep
>> > > >> that part of this separate though, as I don't think that's being actively
>> > > >> pursued right now.
>> > > >>
>> > > >> Jon
>> > > >>
>> > > >> On Wed, Jan 25, 2017 at 10:49 AM David Lyle <dlyle65535@gmail.com> wrote:
>> > > >>
>> > > >> RE: separate JIRA for MPack/Ansible. No objection to tracking them
>> > > >> separately, but for this item to be complete, you'll need both the
>> > > >> feature and the ability to install it.
>> > > >>
>> > > >> -D...
>> > > >>
>> > > >>
>> > > >> On Tue, Jan 24, 2017 at 5:33 PM, Ryan Merriman <merrimanr@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >>> Assuming we're going to write all errors to a single error topic, I
>> > > >>> think it makes sense to agree on an error message schema and handle
>> > > >>> errors across the 3 different topologies in the same way with a single
>> > > >>> implementation.  The implementation in ParserBolt
>> > > >>> (ErrorUtils.handleError) produces the most verbose error object so I
>> > > >>> think it's a good candidate for the single implementation.  Here is the
>> > > >>> message structure it currently produces:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error"
>> > > >>> }
>> > > >>>
>> > > >>> From our discussion so far we need to add a couple fields: an error type
>> > > >>> and hash id.  Adding these to the message looks like:
>> > > >>>
>> > > >>> {
>> > > >>>   "exception": "java.lang.Exception: there was an error",
>> > > >>>   "hostname": "host",
>> > > >>>   "stack": "java.lang.Exception: ...",
>> > > >>>   "time": 1485295416563,
>> > > >>>   "message": "there was an error",
>> > > >>>   "rawMessage": "raw message",
>> > > >>>   "rawMessage_bytes": [],
>> > > >>>   "source.type": "bro_error",
>> > > >>>   "error.type": "parser_error",
>> > > >>>   "rawMessage_hash": "dde41b9920954f94066daf6291fb58a9"
>> > > >>> }
>> > > >>>
>> > > >>> We should also consider expanding the error types I listed earlier.
>> > > >>> Instead of just having "indexing_error" we could have
>> > > >>> "elasticsearch_indexing_error", "hdfs_indexing_error" and so on.
>> > > >>>
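A minimal sketch of adding the two proposed fields to an existing error object. The helper name is hypothetical and SHA-256 is an assumption: the thread does not say which algorithm produced the example rawMessage_hash value.

```python
import hashlib


def add_error_fields(error, error_type):
    """Return a copy of the error with error.type and rawMessage_hash set."""
    enriched = dict(error)  # leave the caller's object untouched
    enriched["error.type"] = error_type
    raw = enriched.get("rawMessage", "")
    enriched["rawMessage_hash"] = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    return enriched
```

A finer-grained type such as "elasticsearch_indexing_error" would simply be passed as error_type; nothing else in the structure changes.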
>> > > >>> Jon, if an exception happens in an enrichment or threat intel bolt the
>> > > >>> message is passed along with no error thrown (only logged).  Everywhere
>> > > >>> else I'm having trouble identifying specific fields that should be
>> > > >>> hashed.  Would hashing the message in every case be acceptable?  Do you
>> > > >>> know of a place where we could hash a field instead?  On the topic of
>> > > >>> exceptions in enrichments, are we ok with an error only being logged and
>> > > >>> not added to the message or emitted to the error queue?
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On Tue, Jan 24, 2017 at 3:10 PM, Ryan Merriman <merrimanr@gmail.com>
>> > > >>> wrote:
>> > > >>>
>> > > >>>> That use case makes sense to me.  I don't think it will require that
>> > > >>>> much additional effort either.
>> > > >>>>
>> > > >>>> On Tue, Jan 24, 2017 at 1:02 PM, Zeolla@GMail.com <zeolla@gmail.com>
>> > > >>>> wrote:
>> > > >>>>
>> > > >>>>> Regarding error vs validation - Either way I'm not very concerned.  I
>> > > >>>>> initially assumed they would be combined and agree with that approach,
>> > > >>>>> but splitting them out isn't a very big deal to me either.
>> > > >>>>>
>> > > >>>>> Re: Ryan.  Yes, exactly.  In the case of a parser issue (or anywhere
>> > > >>>>> else where it's not possible to pick out the exact thing causing the
>> > > >>>>> issue) it would be a hash of the complete message.
>> > > >>>>>
>> > > >>>>> Regarding the architecture, I mostly agree with James except that I
>> > > >>>>> think step 3 needs to also be able to somehow group errors via the
>> > > >>>>> original data (identify replays, identify repeat issues with data in a
>> > > >>>>> specific field, issues with consistently different data, etc.).  This is
>> > > >>>>> essentially the first step of troubleshooting, which I assume you are
>> > > >>>>> doing if you're looking at the error dashboard.
>> > > >>>>>
>> > > >>>>> If the hash gets moved out of the initial implementation, I'm fairly
>> > > >>>>> certain you lose this ability.  The point here isn't to handle long
>> > > >>>>> fields (although that's a benefit of this approach), it's to attach a
>> > > >>>>> unique identifier to the error/validation issue message that links it to
>> > > >>>>> the original problem.  I'd be happy to consider alternative solutions to
>> > > >>>>> this problem (for instance, actually sending across the data itself) I
>> > > >>>>> just haven't been able to think of another way to do this that I like
>> > > >>>>> better.
>> > > >>>>>
>> > > >>>>> Jon
>> > > >>>>>
>> > > >>>>> On Tue, Jan 24, 2017 at 1:13 PM Ryan Merriman <merrimanr@gmail.com>
>> > > >>>>> wrote:
>> > > >>>>>
>> > > >>>>>> We also need a JIRA for any install/Ansible/MPack work needed.
>> > > >>>>>>
>> > > >>>>>> On Tue, Jan 24, 2017 at 12:06 PM, James Sirota <jsirota@apache.org>
>> > > >>>>>> wrote:
>> > > >>>>>>> Now that I had some time to think about it I would collapse all error
>> > > >>>>>>> and validation topics into one.  We can differentiate between
>> > > >>>>>>> different views of the data (split by error source etc) via Kibana
>> > > >>>>>>> dashboards.  I would implement this feature incrementally.  First I
>> > > >>>>>>> would modify all the bolts to log to a single topic.  Second, I would
>> > > >>>>>>> get the error indexing done by attaching the indexing topology to the
>> > > >>>>>>> error topic. Third I would create the necessary dashboards to view
>> > > >>>>>>> errors and validation failures by source.  Lastly, I would file a
>> > > >>>>>>> follow-on JIRA to introduce hashing of errors or fields that are too
>> > > >>>>>>> long.  It seems like a separate feature that we need to think through.
>> > > >>>>>>> We may need a stellar function around that.
>> > > >>>>>>>
>> > > >>>>>>> Thanks,
>> > > >>>>>>> James
>> > > >>>>>>>
>> > > >>>>>>> 24.01.2017, 10:25, "Ryan Merriman" <merrimanr@gmail.com>:
>> > > >>>>>>>> I understand what Jon is talking about. He's proposing we hash the
>> > > >>>>>>>> value that caused the error, not necessarily the error message
>> > > >>>>>>>> itself. For an enrichment this is easy. Just pass along the field
>> > > >>>>>>>> value that failed enrichment. For other cases the field that caused
>> > > >>>>>>>> the error may not be so obvious. Take parser validation for example.
>> > > >>>>>>>> The message is validated as a whole and it may not be easy to
>> > > >>>>>>>> determine which field is the cause. In that case would a hash of the
>> > > >>>>>>>> whole message work?
>> > > >>>>>>>>
>> > > >>>>>>>> There is a broader architectural discussion that needs to happen
>> > > >>>>>>>> before we can implement this. Currently we have an indexing topology
>> > > >>>>>>>> that reads from 1 topic and writes messages to ES but errors are
>> > > >>>>>>>> written to several different topics:
>> > > >>>>>>>>
>> > > >>>>>>>>    - parser_error
>> > > >>>>>>>>    - parser_invalid
>> > > >>>>>>>>    - enrichments_error
>> > > >>>>>>>>    - threatintel_error
>> > > >>>>>>>>    - indexing_error
>> > > >>>>>>>>
>> > > >>>>>>>> I can see 4 possible approaches to implementing this:
>> > > >>>>>>>>
>> > > >>>>>>>>    1. Create an index topology for each error topic
>> > > >>>>>>>>       1. Good because we can easily reuse the indexing topology and
>> > > >>>>>>>>       would require the least development effort
>> > > >>>>>>>>       2. Bad because it would consume a lot of extra worker slots
>> > > >>>>>>>>    2. Move the topic name into the error JSON message as a new
>> > > >>>>>>>>    "error_type" field and write all messages to the indexing topic
>> > > >>>>>>>>       1. Good because we don't need to create a new topology
>> > > >>>>>>>>       2. Bad because we would be flowing data and errors through the
>> > > >>>>>>>>       same topology. A spike in errors could affect message indexing.
>> > > >>>>>>>>    3. Compromise between 1 and 2. Create another indexing topology
>> > > >>>>>>>>    that is dedicated to indexing errors. Move the topic name into the
>> > > >>>>>>>>    error JSON message as a new "error_type" field and write all errors
>> > > >>>>>>>>    to a single error topic.
>> > > >>>>>>>>    4. Write a completely new topology with multiple spouts (1 for each
>> > > >>>>>>>>    error type listed above) that all feed into a single
>> > > >>>>>>>>    BulkMessageWriterBolt.
>> > > >>>>>>>>       1. Good because the current topologies would not need to change
>> > > >>>>>>>>       2. Bad because it would require the most development effort,
>> > > >>>>>>>>       would not reuse existing topologies and takes up more worker
>> > > >>>>>>>>       slots than 3
>> > > >>>>>>>>
>> > > >>>>>>>> Are there other approaches I haven't thought of? I think 1 and 2 are
>> > > >>>>>>>> off the table because they are shortcuts and not good long-term
>> > > >>>>>>>> solutions. 3 would be my choice because it introduces less complexity
>> > > >>>>>>>> than 4. Thoughts?
>> > > >>>>>>>>
>> > > >>>>>>>> Ryan
>> > > >>>>>>>>
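Approach 3 above amounts to a small transformation on each error before it is written. The following Python sketch shows only that transformation, with no Kafka client involved; the single topic name "error" and the function name are assumptions, while the five legacy topic names come from the list above.

```python
# Legacy per-stage error topics named in the message above.
LEGACY_ERROR_TOPICS = {
    "parser_error",
    "parser_invalid",
    "enrichments_error",
    "threatintel_error",
    "indexing_error",
}

# Hypothetical name for the single consolidated error topic.
SINGLE_ERROR_TOPIC = "error"


def route_error(legacy_topic, error):
    """Move the old topic name into "error_type" and target one topic."""
    if legacy_topic not in LEGACY_ERROR_TOPICS:
        raise ValueError(f"unknown error topic: {legacy_topic}")
    record = dict(error)
    record["error_type"] = legacy_topic
    return SINGLE_ERROR_TOPIC, record
```

Because the error type travels inside the message, a dedicated error-indexing topology can read one topic and dashboards can still split the data by source.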
>> > > >>>>>>>> On Mon, Jan 23, 2017 at 5:44 PM, Zeolla@GMail.com <zeolla@gmail.com>
>> > > >>>>>>>> wrote:
>> > > >>>>>>>>>  In that case the hash would be of the value in the IP field, such
>> > > >>>>>>>>>  as sha3(8.8.8.8).
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Jon
>> > > >>>>>>>>>
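To illustrate how a hash of the failing value can act as a grouping key, here is a minimal Python sketch using the standard library's SHA3-256. The function names are hypothetical, and the choice of the 256-bit variant is an assumption, since the message above only says "sha3".

```python
import hashlib
from collections import Counter


def value_hash(value):
    """SHA3-256 digest of the field value that failed, e.g. an IP string."""
    return hashlib.sha3_256(value.encode("utf-8")).hexdigest()


def count_failures(failed_values):
    """Count failures per distinct value, keyed by hash instead of raw value."""
    return Counter(value_hash(v) for v in failed_values)
```

The error string for an invalid IP field is the same for every bad address, whereas the hash of the value differs per address, so repeats of one offending value can be told apart from many different offending values.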
>> > > >>>>>>>>>  On Mon, Jan 23, 2017, 6:41 PM James Sirota <jsirota@apache.org>
>> > > >>>>>>>>>  wrote:
>> > > >>>>>>>>>  > Jon,
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > I am still not entirely following why we would want to use
>> > > >>>>>>>>>  > hashing. For example if my error is "Your IP field is invalid and
>> > > >>>>>>>>>  > failed validation" hashing this error string will always result in
>> > > >>>>>>>>>  > the same hash. Why not just use the actual error string? Can you
>> > > >>>>>>>>>  > provide an example where you would use it?
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > Thanks,
>> > > >>>>>>>>>  > James
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > 23.01.2017, 16:29, "Zeolla@GMail.com" <zeolla@gmail.com>:
>> > > >>>>>>>>>  > > For 1 - I'm good with that.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > I'm talking about hashing the relevant content itself not the
>> > > >>>>>>>>>  > > error. Some benefits are (1) minimize load on search index
>> > > >>>>>>>>>  > > (there's minimal benefit in spending the CPU and disk to keep it
>> > > >>>>>>>>>  > > at full fidelity (tokenize and store)) (2) provide something to
>> > > >>>>>>>>>  > > key on for dashboards (assuming a good hash algorithm that
>> > > >>>>>>>>>  > > avoids collisions and is second preimage resistant) and (3)
>> > > >>>>>>>>>  > > specific to errors, if the issue is that it failed to index, a
>> > > >>>>>>>>>  > > hash gives us some protection that the issue will not occur
>> > > >>>>>>>>>  > > twice.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > On Mon, Jan 23, 2017, 2:47 PM James Sirota <jsirota@apache.org>
>> > > >>>>>>>>>  > > wrote:
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon,
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > With regards to 1, collapsing to a single dashboard for each
>> > > >>>>>>>>>  > > would be fine. So we would have one error index and one "failed
>> > > >>>>>>>>>  > > to validate" index. The distinction is that errors would be
>> > > >>>>>>>>>  > > things that went wrong during stream processing (failed to
>> > > >>>>>>>>>  > > parse, etc...), while validation failures are messages that
>> > > >>>>>>>>>  > > explicitly failed stellar validation/schema enforcement. There
>> > > >>>>>>>>>  > > should be relatively few of the second type.
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > With respect to 3, why do you want the error hashed? Why not
>> > > >>>>>>>>>  > > just search for the error text?
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Thanks,
>> > > >>>>>>>>>  > > James
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > 20.01.2017, 14:01, "Zeolla@GMail.com" <zeolla@gmail.com>:
>> > > >>>>>>>>>  > >> As someone who currently fills the platform engineer role, I
>> > > >>>>>>>>>  > >> can give this idea a huge +1. My thoughts:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 1. I think it depends on exactly what data is pushed into the
>> > > >>>>>>>>>  > >> index (#3). However, assuming the errors you proposed
>> > > >>>>>>>>>  > >> recording, I can't see huge benefits to having more than one
>> > > >>>>>>>>>  > >> dashboard. I would be happy to be persuaded otherwise.
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 2. I would say yes, storing the errors in HDFS in addition to
>> > > >>>>>>>>>  > >> indexing is a good thing. Using METRON-510
>> > > >>>>>>>>>  > >> <https://issues.apache.org/jira/browse/METRON-510> as a case
>> > > >>>>>>>>>  > >> study, there is the potential in this environment for
>> > > >>>>>>>>>  > >> attacker-controlled data to result in processing errors which
>> > > >>>>>>>>>  > >> could be a method of evading security monitoring. Once an
>> > > >>>>>>>>>  > >> attack is identified, the long term HDFS storage would allow
>> > > >>>>>>>>>  > >> better historical analysis for low-and-slow/persistent attacks
>> > > >>>>>>>>>  > >> (I'm thinking of a method of data exfil that also won't
>> > > >>>>>>>>>  > >> successfully get stored in Lucene, but is hard to identify
>> > > >>>>>>>>>  > >> over a short period of time).
>> > > >>>>>>>>>  > >> - Along this line, I think that there are various parts of
>> > > >>>>>>>>>  > >> Metron (this included) which could benefit from having method
>> > > >>>>>>>>>  > >> of configuring data aging by bucket in HDFS (Following Nick's
>> > > >>>>>>>>>  > >> comments here
>> > > >>>>>>>>>  > >> <https://issues.apache.org/jira/browse/METRON-477>).
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> 3. I would potentially add a hash of the content that failed
>> > > >>>>>>>>>  > >> validation to help identify repeats over time with less of a
>> > > >>>>>>>>>  > >> concern that you'd have back to back failures (i.e. instead of
>> > > >>>>>>>>>  > >> storing the value itself). Additionally, I think it's helpful
>> > > >>>>>>>>>  > >> to be able to search all times there was an indexing error
>> > > >>>>>>>>>  > >> (instead of it hitting the catch-all).
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Jon
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> On Fri, Jan 20, 2017 at 1:17 PM James Sirota
>> > > >>>>>>>>>  > >> <jsirota@apache.org> wrote:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> We already have a capability to capture bolt errors and
>> > > >>>>>>>>>  > >> validation errors and pipe them into a Kafka topic. I want to
>> > > >>>>>>>>>  > >> propose that we attach a writer topology to the error and
>> > > >>>>>>>>>  > >> validation failed kafka topics so that we can (a) create a new
>> > > >>>>>>>>>  > >> ES index for these errors and (b) create a new Kibana
>> > > >>>>>>>>>  > >> dashboard to visualize them. The benefit would be that errors
>> > > >>>>>>>>>  > >> and validation failures would be easier to see and analyze.
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> I am seeking feedback on the following:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> - How granular would we want this feature to be? Think we
>> > > >>>>>>>>>  > >> would want one index/dashboard per source? Or would it be
>> > > >>>>>>>>>  > >> better to collapse everything into the same index?
>> > > >>>>>>>>>  > >> - Do we care about storing these errors in HDFS as well? Or is
>> > > >>>>>>>>>  > >> indexing them enough?
>> > > >>>>>>>>>  > >> - What types of errors should we record? I am proposing:
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> For error reporting:
>> > > >>>>>>>>>  > >> --Message failed to parse
>> > > >>>>>>>>>  > >> --Enrichment failed to enrich
>> > > >>>>>>>>>  > >> --Threat intel feed failures
>> > > >>>>>>>>>  > >> --Generic catch-all for all other errors
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> For validation reporting:
>> > > >>>>>>>>>  > >> --What part of message failed validation
>> > > >>>>>>>>>  > >> --What stellar validator caused the failure
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> -------------------
>> > > >>>>>>>>>  > >> Thank you,
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> James Sirota
>> > > >>>>>>>>>  > >> PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > >> jsirota AT apache DOT org
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> --
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Jon
>> > > >>>>>>>>>  > >>
>> > > >>>>>>>>>  > >> Sent from my mobile device
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > -------------------
>> > > >>>>>>>>>  > > Thank you,
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > James Sirota
>> > > >>>>>>>>>  > > PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > > jsirota AT apache DOT org
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > --
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Jon
>> > > >>>>>>>>>  > >
>> > > >>>>>>>>>  > > Sent from my mobile device
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > -------------------
>> > > >>>>>>>>>  > Thank you,
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  > James Sirota
>> > > >>>>>>>>>  > PPMC- Apache Metron (Incubating)
>> > > >>>>>>>>>  > jsirota AT apache DOT org
>> > > >>>>>>>>>  >
>> > > >>>>>>>>>  --
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Jon
>> > > >>>>>>>>>
>> > > >>>>>>>>>  Sent from my mobile device
>> > > >>>>>>> -------------------
>> > > >>>>>>> Thank you,
>> > > >>>>>>>
>> > > >>>>>>> James Sirota
>> > > >>>>>>> PPMC- Apache Metron (Incubating)
>> > > >>>>>>> jsirota AT apache DOT org
>> > > >>>>>>>
>> > > >>>>> --
>> > > >>>>>
>> > > >>>>> Jon
>> > > >>>>>
>> > > >>>>> Sent from my mobile device
>> > > >>>>>
>> > > >>>>
>> > > >> --
>> > > >>
>> > > >> Jon
>> > > >>
>> > > >> Sent from my mobile device
>> > > >>
>> > >
>> > > --
>> >
>> > Jon
>> >
>> > Sent from my mobile device
>> >
>> > --
>>
>> Jon
>>