metron-dev mailing list archives

From Ali Nazemian <alinazem...@gmail.com>
Subject Re: UI pivotting / aggregation backend
Date Sat, 08 Jul 2017 13:59:49 GMT
Since some people prefer Solr and others prefer Elasticsearch, an
abstraction layer over both would be really great. However, I haven't seen
any framework that provides the required level of search abstraction on top
of Solr and Elasticsearch, though I'd guess one exists. I'm thinking of
something like Apache Calcite, but more specific to search queries. Without
that, we would have to implement too much ourselves.
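
To make the idea concrete, here is a minimal sketch of what such a layer could look like: one backend-neutral grouping request translated into an Elasticsearch nested terms aggregation and into a Solr JSON Facet API request. The class and function names (GroupRequest, to_elasticsearch, to_solr) and the field names are hypothetical, chosen only for illustration; nothing like this exists in Metron today as far as I know.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class GroupRequest:
    """Hypothetical backend-neutral pivot: group by each field in order."""
    group_by: List[str]      # outermost grouping first
    bucket_limit: int = 10   # maximum buckets returned per level


def to_elasticsearch(req: GroupRequest) -> Dict[str, Any]:
    """Build nested terms aggregations, starting from the innermost level."""
    aggs: Dict[str, Any] = {}
    for name in reversed(req.group_by):
        level: Dict[str, Any] = {
            "terms": {"field": name, "size": req.bucket_limit}
        }
        if aggs:
            level["aggs"] = aggs  # wrap the deeper levels inside this one
        aggs = {"by_" + name: level}
    # "size": 0 asks for aggregation results only, no top-level hits.
    return {"size": 0, "aggs": aggs}


def to_solr(req: GroupRequest) -> Dict[str, Any]:
    """Build the equivalent nested terms facets for Solr's JSON Facet API."""
    facet: Dict[str, Any] = {}
    for name in reversed(req.group_by):
        level: Dict[str, Any] = {
            "type": "terms",
            "field": name,
            "limit": req.bucket_limit,
        }
        if facet:
            level["facet"] = facet  # Solr nests sub-facets under "facet"
        facet = {"by_" + name: level}
    # "limit": 0 suppresses the document list, leaving only facet counts.
    return {"query": "*:*", "limit": 0, "facet": facet}
```

The UI would only ever build a GroupRequest, for example GroupRequest(group_by=["user", "ip_dst_addr", "severity"]), and the layer would pick the translator for whichever index is configured.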



On Fri, Jul 7, 2017 at 6:48 PM, Casey Stella <cestella@gmail.com> wrote:

> I just want to chime in and support the notion of an abstraction layer
> between the UI and the indexed stores.  I think that having an API that
> people can conform to is going to be important as people want to plug in
> their own backing indices in the future.
>
> Casey
>
> On Thu, Jul 6, 2017 at 2:11 PM, Justin Leet <justinjleet@gmail.com> wrote:
>
> > I wanted to bring up some stuff on the backend of our UI and get
> > thoughts (+ things I overlooked, etc.).  There are also a couple of
> > points at the end that merit discussion, since they get into how we
> > handle our ES templates (we generally want to aggregate on raw fields,
> > not analyzed ones).
> >
> > To set the use case a bit, when we're looking through alerts in the UI,
> > we're going to want to be able to start pivoting and grouping in the UI.
> >
> > For example, given a list of alerts, we may want to follow an ordering of
> > groupings like so:
> >
> > All Alerts
> > --> Bucketed by User
> > ----> Then further by Destination IP
> > ------> Then further by Severity
> >
> > The stuff I expect we'll want to be able to do:
> > * Pivot through multiple layers (as in the example above).
> > * Get counts within each bucket (Do we have a lot of high severity
> > alerts? Mostly medium? etc?)
> > * Get a subset of fields (I assume we don't want every entire doc that
> > comes back in the bucket)
> > * Pagination (if I have > X docs, show me X and let me retrieve more as
> > needed)
> > * Sorting within a bucket (I may want to sort by time, by userid, etc.)
> > * Filtering (Be able to do this stuff while only showing high severity
> > alerts)
> >
> > In terms of actually implementing this, to the best of my limited
> > knowledge (and playing around with ES looking into this), this seems
> > like pretty doable stuff, out of the box. See:
> > https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-bucket-terms-aggregation.html
> >
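
As a rough illustration of how the list above could map onto a single ES request, here is a sketch that combines a filter, one level of bucketing, and a top_hits sub-aggregation for the field subset, sorting, and paging within each bucket. The function name and the field names are hypothetical, and the body assumes the ES 2.x query DSL linked above; per-bucket counts come back as doc_count on each bucket without extra work.

```python
from typing import Any, Dict, List


def bucketed_alerts_query(group_field: str,
                          source_fields: List[str],
                          sort_field: str,
                          page: int = 0,
                          page_size: int = 10,
                          severity: str = "high") -> Dict[str, Any]:
    """Build a request body: filter alerts, bucket them, page docs per bucket."""
    return {
        "size": 0,  # aggregation results only, no top-level hits
        # Filtering: restrict everything below to one severity (assumed field).
        "query": {"bool": {"filter": [{"term": {"severity": severity}}]}},
        "aggs": {
            "by_" + group_field: {
                # Bucketing: each bucket carries its own doc_count.
                "terms": {"field": group_field, "size": 10},
                "aggs": {
                    "docs": {
                        "top_hits": {
                            # Pagination of documents inside each bucket.
                            "from": page * page_size,
                            "size": page_size,
                            # Sorting within a bucket.
                            "sort": [{sort_field: {"order": "desc"}}],
                            # Subset of fields instead of whole documents.
                            "_source": source_fields,
                        }
                    }
                },
            }
        },
    }
```

Note that this pages through the documents inside a bucket, not through the buckets themselves; the number of buckets is only capped by the terms size.
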
> > There are two main pain points I see in this:
> > * Actually constructing these queries.  I don't know that we've
> > explicitly said we want a layer of abstraction between the UI and the
> > real time store, but I strongly suggest we have one.  Theoretically, we
> > should be able to support (at least) Solr and ES in the UI, not just
> > one.  Unfortunately, since they don't share a query syntax, this means
> > we have two impls, and I'd personally like to see an abstraction that
> > delegates appropriately.
> >
> > * Aggregations in ES function post analysis. This means that we'll
> > typically want the raw field value to be available to aggregate on.  In
> > the ES implementation, this means a "not_analyzed" field. Glancing
> > (incredibly) briefly through our templates, we do have some string
> > values that are analyzed (and I have no idea if they're generally
> > relevant to this UI or not, I just didn't look).  I'm also assuming
> > Stellar enrichments are analyzed right now.  I'm also unsure what
> > happens to metadata (
> > https://github.com/apache/metron/pull/621)  Essentially the question is:
> > "How do we handle this, particularly since we're a pretty dynamic
> > system?"
> >
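
One possible way to handle dynamically added string fields, sketched here under the assumption that we stay on the ES 2.x mapping syntax, is an index template with a dynamic template that maps every new string field as not_analyzed. The function name and the index pattern argument are hypothetical, and this is not what the current Metron templates do.

```python
from typing import Any, Dict


def not_analyzed_strings_template(index_pattern: str) -> Dict[str, Any]:
    """Build an ES 2.x index template body that keeps new strings raw."""
    return {
        # Index name pattern the template applies to (assumed by the caller).
        "template": index_pattern,
        "mappings": {
            # "_default_" applies the rule to every document type.
            "_default_": {
                "dynamic_templates": [
                    {
                        "strings_not_analyzed": {
                            # Match any field detected as a string.
                            "match_mapping_type": "string",
                            # Index the raw value so aggregations bucket on
                            # whole values rather than analyzed tokens.
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        },
    }
```

The trade-off is that such fields lose full-text search on their tokens, so fields that need both would still require an explicit mapping.
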
>



-- 
A.Nazemian
