metron-dev mailing list archives

From David Lyle <dlyle65...@gmail.com>
Subject Re: [DISCUSS] Turning off indexing writers feature discussion
Date Fri, 13 Jan 2017 15:54:27 GMT
Casey,

Can you give me a level set of what your thinking is now? I think it's
global control of all index types + overrides on a per-type basis. Fwiw,
I'm totally for that, but I want to make sure I'm not imposing my
preconceived notions on your consensus-driven ones.

-D....

On Fri, Jan 13, 2017 at 10:44 AM, Casey Stella <cestella@gmail.com> wrote:

> I am suggesting that, yes.  The configs are essentially the same as yours,
> except there is an override specified at the top level.  Without that, in
> order to specify both HDFS and ES have batch sizes of 100, you have to
> explicitly configure each.  It's less that I'm trying to have backwards
> compatibility and more that I'm trying to make the majority case easy: both
> writers write everything to a specified index name with a specified batch
> size (which is what we have now).  Beyond that, I want to allow for
> specifying an override for the config on a writer-by-writer basis for those
> who need it.
>
> On Fri, Jan 13, 2017 at 10:39 AM, Nick Allen <nick@nickallen.org> wrote:
>
> > Are you saying we support all of these variants?  I realize you are trying
> > to have some backwards compatibility, but this also makes it harder for a
> > user to grok (for me at least).
> >
> > Personally I like my original example as there are fewer sub-structures,
> > like 'writerConfig', which makes the whole thing simpler and easier to
> > grok.  But maybe others will think your proposal is just as easy to grok.
> >
> >
> >
> > > On Fri, Jan 13, 2017 at 10:01 AM, Casey Stella <cestella@gmail.com> wrote:
> >
> > > Ok, so here's what I'm thinking based on the discussion:
> > >
> > > >    - Keeping the configs that we have now (batchSize and index) as defaults
> > > >    for the unspecified writer-specific case
> > >    - Adding the config Nick suggested
> > >
> > > *Base Case*:
> > > {
> > > }
> > >
> > >    - all writers write all messages
> > >    - index named the same as the sensor for all writers
> > >    - batchSize of 1 for all writers
> > >
> > > *Writer-non-specific case*:
> > > {
> > >   "index" : "foo"
> > >  ,"batchSize" : 100
> > > }
> > >
> > >    - All writers write all messages
> > >    - index is named "foo", different from the sensor for all writers
> > >    - batchSize is 100 for all writers
> > >
> > > *Writer-specific case without filters*
> > > {
> > >   "index" : "foo"
> > >  ,"batchSize" : 1
> > >  , "writerConfig" :
> > >    {
> > >       "elasticsearch" : {
> > >                                    "batchSize" : 100
> > >                                  }
> > >    }
> > > }
> > >
> > >    - All writers write all messages
> > >    - index is named "foo", different from the sensor for all writers
> > >    - batchSize is 1 for HDFS and 100 for elasticsearch writers
> > >    - NOTE: I could override the index name too
> > >
> > > *Writer-specific case with filters*
> > > {
> > >   "index" : "foo"
> > >  ,"batchSize" : 1
> > >  , "writerConfig" :
> > >    {
> > >       "elasticsearch" : {
> > >                                    "batchSize" : 100,
> > >                                    "when" : "exists(field1)"
> > >                                  },
> > >       "hdfs" : {
> > >                      "when" : "false"
> > >                   }
> > >    }
> > > }
> > >
> > > >    - ES writer writes only messages which have field1; HDFS writes nothing
> > > >    - index is named "foo", different from the sensor for all writers
> > > >    - batchSize is 100 for elasticsearch writers
> > >
> > > Thoughts?
> > >
> > > > On Fri, Jan 13, 2017 at 9:44 AM, Carolyn Duby <cduby@hortonworks.com> wrote:
> > > >
> > > > > For larger installations you need to control what is indexed so you don’t
> > > > > end up with a nasty Elasticsearch situation and so you can mine the data
> > > > > later for reports and training ML models.
> > > >
> > > > Thanks
> > > > Carolyn
> > > >
> > > >
> > > >
> > > >
> > > > On 1/13/17, 9:40 AM, "Casey Stella" <cestella@gmail.com> wrote:
> > > >
> > > > >OH that's a good idea!
> > > > >
> > > > > >On Fri, Jan 13, 2017 at 9:39 AM, Nick Allen <nick@nickallen.org> wrote:
> > > > > >
> > > > > >> I like the "Index Filtering" option based on the flexibility that it
> > > > > >> provides.  Should each output (HDFS, ES, etc) have its own configuration
> > > > > >> settings?  For example, aren't things like batching handled separately for
> > > > > >> HDFS versus Elasticsearch?
> > > > >>
> > > > >> Something along the lines of...
> > > > >>
> > > > >> {
> > > > >>   "hdfs" : {
> > > > >>     "when": "exists(field1)",
> > > > >>     "batchSize": 100
> > > > >>   },
> > > > >>
> > > > >>   "elasticsearch" : {
> > > > >>     "when": "true",
> > > > >>     "batchSize": 1000,
> > > > >>     "index": "squid"
> > > > >>   }
> > > > >> }
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > >>
> > > > > >> On Fri, Jan 13, 2017 at 9:10 AM, Casey Stella <cestella@gmail.com> wrote:
> > > > > >>
> > > > > >> > Yeah, I tend to like the first option too.  Any opposition to that from
> > > > > >> > anyone?
> > > > > >> >
> > > > > >> > The points brought up are good ones and I think that it may be worth a
> > > > > >> > broader discussion of the requirements of indexing in a separate dev list
> > > > > >> > thread.  Maybe a list of desires with coherent use-cases justifying them so
> > > > > >> > we can think about how this stuff should work and where the natural
> > > > > >> > extension points should be.  After all, we need to toe the line between
> > > > > >> > engineering and overengineering for features nobody will want.
> > > > > >> >
> > > > > >> > I'm not sure about the extensions to the standard fields.  I'm torn between
> > > > > >> > the notions that we should have no standard fields vs we should have a
> > > > > >> > boatload of standard fields (with most of them empty).  I exchange
> > > > > >> > positions fairly regularly on that question. ;)  It may be worth a dev list
> > > > > >> > discussion to lay out how you imagine an extension of standard fields and
> > > > > >> > how it might look as implemented in Metron.
> > > > > >> >
> > > > > >> > Casey
> > > > >> >
> > > > > >> > On Thu, Jan 12, 2017 at 9:58 PM, Kyle Richardson <kylerichardson2@gmail.com> wrote:
> > > > > >> >
> > > > > >> > > I'll second my preference for the first option. I think the ability to use
> > > > > >> > > Stellar filters to customize indexing would be a big win.
> > > > > >> > >
> > > > > >> > > I'm glad Matt brought up the point about data lake and CEP. I think this is
> > > > > >> > > a really important use case that we need to consider. Take a simple
> > > > > >> > > example... If I have data coming in from 3 different firewall vendors and 2
> > > > > >> > > different web proxy/url filtering vendors and I want to be able to analyze
> > > > > >> > > that data set, I need the data to be indexed all together (likely in HDFS)
> > > > > >> > > and to have a normalized schema such that IP address, URL, and user name
> > > > > >> > > (to take a few) can be easily queried and aggregated. I can also envision
> > > > > >> > > scenarios where I would want to index data based on attributes other than
> > > > > >> > > sensor, business unit or subsidiary for example.
> > > > > >> > >
> > > > > >> > > I've been wanting to propose extending our 7 standard fields to include
> > > > > >> > > things like URL and user. Is there community interest/support for moving in
> > > > > >> > > that direction? If so, I'll start a new thread.
> > > > >> > >
> > > > >> > > Thanks!
> > > > >> > >
> > > > >> > > -Kyle
> > > > >> > >
> > > > > >> > > On Thu, Jan 12, 2017 at 6:51 PM, Matt Foley <mattf@apache.org> wrote:
> > > > > >> > >
> > > > > >> > > > Ah, I see.  If overriding the default index name allows using the same
> > > > > >> > > > name for multiple sensors, then the goal can be achieved.
> > > > >> > > > Thanks,
> > > > >> > > > --Matt
> > > > >> > > >
> > > > >> > > >
> > > > > >> > > > On 1/12/17, 3:30 PM, "Casey Stella" <cestella@gmail.com> wrote:
> > > > > >> > > >
> > > > > >> > > >     Oh, you could!  Let's say you have a syslog parser with data from sources 1,
> > > > > >> > > >     2 and 3.  You'd end up with one Kafka queue with 3 parsers attached to that
> > > > > >> > > >     queue, each picking apart the messages from source 1, 2 and 3.  They'd go
> > > > > >> > > >     through separate enrichment and into the indexing topology.  In the
> > > > > >> > > >     indexing topology, you could specify the same index name "syslog" and all
> > > > > >> > > >     of the messages go into the same index for CEP querying if so desired.
> > > > >> > > >
> > > > >> > > >     On Thu, Jan 12, 2017 at 6:27 PM, Matt Foley
<
> > > mattf@apache.org
> > > > >
> > > > >> > > wrote:
> > > > >> > > >
> > > > >> > > >     > Syslog is hell on parsers – I know,
I worked at
> LogLogic
> > > in
> > > > a
> > > > >> > > > previous
> > > > >> > > >     > life.  It makes perfect sense to route
different lines
> > > from
> > > > >> > syslog
> > > > >> > > > through
> > > > >> > > >     > different appropriate parsers.  But a
lot of what the
> > > > parsers
> > > > >> do
> > > > >> > is
> > > > >> > > >     > identify consistent subsets of metadata
and annotate
> it
> > –
> > > > eg,
> > > > >> > > > src_ip_addr,
> > > > >> > > >     > event timestamps, etc.  Once those metadata
are
> > annotated
> > > > and
> > > > >> > > > available
> > > > >> > > >     > with common field names, why doesn’t
it make sense to
> > > index
> > > > the
> > > > >> > > > messages
> > > > >> > > >     > together, for CEP querying?  I think
Splunk has
> > > illustrated
> > > > >> this
> > > > >> > > > model.
> > > > >> > > >     >
> > > > >> > > >     > On 1/12/17, 3:00 PM, "Casey Stella" <
> cestella@gmail.com
> > >
> > > > >> wrote:
> > > > >> > > >     >
> > > > >> > > >     >     yeah, I mean, honestly, I think the
approach that
> > > we've
> > > > >> taken
> > > > >> > > for
> > > > >> > > >     > sources
> > > > >> > > >     >     which aggregate different types of
data is to
> > provide
> > > > >> filters
> > > > >> > > at
> > > > >> > > > the
> > > > >> > > >     > parser
> > > > >> > > >     >     level and have multiple parser topologies
(with
> > > > different,
> > > > >> > > > possibly
> > > > >> > > >     >     mutually exclusive filters) running.
 This would
> be
> > a
> > > > >> > > completely
> > > > >> > > >     > separate
> > > > >> > > >     >     sensor.  Imagine a syslog data source
that
> > aggregates
> > > > and
> > > > >> you
> > > > >> > > > want to
> > > > >> > > >     > pick
> > > > >> > > >     >     apart certain pieces of messages.
 This is why the
> > > > initial
> > > > >> > > > thought and
> > > > >> > > >     >     architecture was one index per sensor.
> > > > >> > > >     >
> > > > >> > > >     >     On Thu, Jan 12, 2017 at 5:55 PM,
Matt Foley <
> > > > >> > mattf@apache.org>
> > > > >> > > > wrote:
> > > > >> > > >     >
> > > > >> > > >     >     > I’m thinking that CEP (Complex
Event Processing)
> > is
> > > > >> > contrary
> > > > >> > > > to the
> > > > >> > > >     > idea
> > > > >> > > >     >     > of silo-ing data per sensor.
> > > > >> > > >     >     > Now it’s true that some of
those sensors are
> > already
> > > > >> > > > aggregating
> > > > >> > > >     > data from
> > > > >> > > >     >     > multiple sources, so maybe I’m
wrong here.
> > > > >> > > >     >     > But it just seems to me that
the “data lake”
> > > insights
> > > > >> come
> > > > >> > > from
> > > > >> > > >     > being able
> > > > >> > > >     >     > to make decisions over the whole
mass of data
> > rather
> > > > than
> > > > >> > > just
> > > > >> > > >     > vertical
> > > > >> > > >     >     > slices of it.
> > > > >> > > >     >     >
> > > > >> > > >     >     > On 1/12/17, 2:15 PM, "Casey
Stella" <
> > > > cestella@gmail.com>
> > > > >> > > > wrote:
> > > > >> > > >     >     >
> > > > >> > > >     >     >     Hey Matt,
> > > > >> > > >     >     >
> > > > >> > > >     >     >     Thanks for the comment!
> > > > >> > > >     >     >     1. At the moment, we only
have one index
> name,
> > > the
> > > > >> > > default
> > > > >> > > > of
> > > > >> > > >     > which is
> > > > >> > > >     >     > the
> > > > >> > > >     >     >     sensor name but that's entirely
up to the
> > user.
> > > > This
> > > > >> > is
> > > > >> > > > sensor
> > > > >> > > >     >     > specific,
> > > > >> > > >     >     >     so it'd be a separate config
for each
> sensor.
> > > If
> > > > we
> > > > >> > want
> > > > >> > > > to
> > > > >> > > >     > build
> > > > >> > > >     >     > multiple
> > > > >> > > >     >     >     indices per sensor, we'd
have to think
> > carefully
> > > > >> about
> > > > >> > > how
> > > > >> > > > to do
> > > > >> > > >     > that
> > > > >> > > >     >     > and
> > > > >> > > >     >     >     would be a bigger undertaking.
 I guess I
> can
> > > see
> > > > the
> > > > >> > > use,
> > > > >> > > > though
> > > > >> > > >     >     > (redirect
> > > > >> > > >     >     >     messages to one index vs
another based on a
> > > > predicate
> > > > >> > for
> > > > >> > > > a given
> > > > >> > > >     >     > sensor).
> > > > >> > > >     >     >     Anyway, not where I was
originally thinking
> > that
> > > > this
> > > > >> > > > discussion
> > > > >> > > >     > would
> > > > >> > > >     >     > go,
> > > > >> > > >     >     >     but it's an interesting
point.
> > > > >> > > >     >     >
> > > > >> > > >     >     >     2. I hadn't thought through
the
> implementation
> > > > quite
> > > > >> > yet,
> > > > >> > > > but we
> > > > >> > > >     > don't
> > > > >> > > >     >     >     actually have a splitter
bolt in that
> > topology,
> > > > just
> > > > >> a
> > > > >> > > > spout
> > > > >> > > >     > that goes
> > > > >> > > >     >     > to
> > > > >> > > >     >     >     the elasticsearch writer
and also to the
> hdfs
> > > > writer.
> > > > >> > > >     >     >
> > > > >> > > >     >     >     On Thu, Jan 12, 2017 at
4:52 PM, Matt Foley
> <
> > > > >> > > > mattf@apache.org>
> > > > >> > > >     > wrote:
> > > > >> > > >     >     >
> > > > >> > > >     >     >     > Casey, good to have
controls like this.
> > > Couple
> > > > >> > > > questions:
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     > 1. Regarding the “index”
: “squid”
> > name/value
> > > > pair,
> > > > >> > is
> > > > >> > > > the
> > > > >> > > >     > index name
> > > > >> > > >     >     >     > expected to always
be a sensor name?  Or
> is
> > > the
> > > > >> given
> > > > >> > > > json
> > > > >> > > >     > structure
> > > > >> > > >     >     >     > subordinate to a sensor
name in zookeeper?
> > Or
> > > > can
> > > > >> we
> > > > >> > > > build
> > > > >> > > >     > arbitrary
> > > > >> > > >     >     >     > indexes with this new
specification,
> > > > independent of
> > > > >> > > > sensor?
> > > > >> > > >     > Should
> > > > >> > > >     >     > there
> > > > >> > > >     >     >     > actually be a list
of “indexes”, ie
> > > > >> > > >     >     >     > { “indexes” : [
> > > > >> > > >     >     >     >         {“index”
: “name1”,
> > > > >> > > >     >     >     >                 …
> > > > >> > > >     >     >     >         },
> > > > >> > > >     >     >     >         {“index”
: “name2”,
> > > > >> > > >     >     >     >                 …
> > > > >> > > >     >     >     >         } ]
> > > > >> > > >     >     >     > }
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     > 2. Would the filtering
/ writer selection
> > > logic
> > > > >> take
> > > > >> > > > place in
> > > > >> > > >     > the
> > > > >> > > >     >     > indexing
> > > > >> > > >     >     >     > topology splitter bolt?
 Seems like that
> > would
> > > > have
> > > > >> > the
> > > > >> > > >     > smallest
> > > > >> > > >     >     > impact on
> > > > >> > > >     >     >     > current implementation,
no?
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     > Sorry if these are
already answered in
> > > PR-415, I
> > > > >> > > haven’t
> > > > >> > > > had
> > > > >> > > >     > time to
> > > > >> > > >     >     >     > review that one yet.
> > > > >> > > >     >     >     > Thanks,
> > > > >> > > >     >     >     > --Matt
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     > On 1/12/17, 12:55 PM,
"Michael Miklavcic"
> <
> > > > >> > > >     >     > michael.miklavcic@gmail.com>
> > > > >> > > >     >     >     > wrote:
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >     I like the flexibility
and
> > expressibility
> > > of
> > > > >> the
> > > > >> > > > first
> > > > >> > > >     > option
> > > > >> > > >     >     > with
> > > > >> > > >     >     >     > Stellar
> > > > >> > > >     >     >     >     filters.
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >     M
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >     On Thu, Jan 12,
2017 at 1:51 PM, Casey
> > > > Stella <
> > > > >> > > >     >     > cestella@gmail.com>
> > > > >> > > >     >     >     > wrote:
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >     > As of METRON-652
<
> > > > https://github.com/apache/
> > > > >> > > >     >     >     > incubator-metron/pull/415>,
we
> > > > >> > > >     >     >     >     > will have
decoupled the indexing
> > > > >> configuration
> > > > >> > > > from the
> > > > >> > > >     >     > enrichment
> > > > >> > > >     >     >     >     > configuration.
 As an immediate
> > > follow-up
> > > > to
> > > > >> > > that,
> > > > >> > > > I'd
> > > > >> > > >     > like to
> > > > >> > > >     >     >     > provide the
> > > > >> > > >     >     >     >     > ability to
turn off and on writers
> via
> > > the
> > > > >> > > > configs.  I'd
> > > > >> > > >     > like
> > > > >> > > >     >     > to get
> > > > >> > > >     >     >     > some
> > > > >> > > >     >     >     >     > community
feedback on how the
> > > > functionality
> > > > >> > > should
> > > > >> > > > work,
> > > > >> > > >     > if
> > > > >> > > >     >     > y'all are
> > > > >> > > >     >     >     >     > amenable.
:)
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > As of now,
we have 3 possible
> writers
> > > > which
> > > > >> can
> > > > >> > > be
> > > > >> > > > used
> > > > >> > > >     > in the
> > > > >> > > >     >     >     > indexing
> > > > >> > > >     >     >     >     > topology:
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     >    - Solr
> > > > >> > > >     >     >     >     >    - Elasticsearch
> > > > >> > > >     >     >     >     >    - HDFS
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > HDFS is always
used, elasticsearch
> or
> > > > solr is
> > > > >> > > used
> > > > >> > > >     > depending
> > > > >> > > >     >     > on how
> > > > >> > > >     >     >     > you
> > > > >> > > >     >     >     >     > start the
indexing topology.
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > A couple of
proposals come to mind
> > > > >> immediately:
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > *Index Filtering*
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > You would
be able to specify a
> filter
> > as
> > > > >> > defined
> > > > >> > > > by a
> > > > >> > > >     > stellar
> > > > >> > > >     >     >     > statement
> > > > >> > > >     >     >     >     > (likely a
reuse of the StellarFilter
> > > that
> > > > >> > exists
> > > > >> > > > in the
> > > > >> > > >     >     > Parsers)
> > > > >> > > >     >     >     > which
> > > > >> > > >     >     >     >     > would allow
you to indicate on a
> > > > >> > > > message-by-message basis
> > > > >> > > >     >     > whether or
> > > > >> > > >     >     >     > not to
> > > > >> > > >     >     >     >     > write the
message.
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > The semantics
of this would be as
> > > follows:
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     >    - Default
(i.e. unspecified) is
> to
> > > pass
> > > > >> > > > everything
> > > > >> > > >     > through
> > > > >> > > >     >     > (hence
> > > > >> > > >     >     >     >     >    backwards
compatible with the
> > current
> > > > >> > default
> > > > >> > > > config).
> > > > >> > > >     >     >     >     >    - Messages
which have the
> > associated
> > > > >> stellar
> > > > >> > > > statement
> > > > >> > > >     >     > evaluate
> > > > >> > > >     >     >     > to true
> > > > >> > > >     >     >     >     >    for the
writer type will be
> > written,
> > > > >> > otherwise
> > > > >> > > > not.
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > Sample indexing
config which would
> > write
> > > > out
> > > > >> no
> > > > >> > > > messages
> > > > >> > > >     > to
> > > > >> > > >     >     > HDFS and
> > > > >> > > >     >     >     > write
> > > > >> > > >     >     >     >     > out only messages
containing a field
> > > > called
> > > > >> > > > "field1":
> > > > >> > > >     >     >     >     > {
> > > > >> > > >     >     >     >     >    "index"
: "squid"
> > > > >> > > >     >     >     >     >   ,"batchSize"
: 100
> > > > >> > > >     >     >     >     >   ,"filters"
: {
> > > > >> > > >     >     >     >     >       "HDFS"
: "false"
> > > > >> > > >     >     >     >     >      ,"ES"
: "exists(field1)"
> > > > >> > > >     >     >     >     >          
       }
> > > > >> > > >     >     >     >     > }
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > *Index On/Off
Switch*
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > A simpler
solution would be to just
> > > > provide a
> > > > >> > > list
> > > > >> > > > of
> > > > >> > > >     > writers
> > > > >> > > >     >     > to
> > > > >> > > >     >     >     > write
> > > > >> > > >     >     >     >     > messages.
 The semantics would be as
> > > > follows:
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     >    - If the
list is unspecified,
> then
> > > the
> > > > >> > default
> > > > >> > > > is to
> > > > >> > > >     > write
> > > > >> > > >     >     > all
> > > > >> > > >     >     >     > messages
> > > > >> > > >     >     >     >     >    for every
writer in the indexing
> > > > topology
> > > > >> > > >     >     >     >     >    - If the
list is specified, then
> a
> > > > writer
> > > > >> > will
> > > > >> > > > write
> > > > >> > > >     > all
> > > > >> > > >     >     > messages
> > > > >> > > >     >     >     > if and
> > > > >> > > >     >     >     >     >    only if
it is named in the list.
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > Sample indexing
config which turns
> off
> > > > HDFS
> > > > >> and
> > > > >> > > > keeps on
> > > > >> > > >     >     >     > Elasticsearch:
> > > > >> > > >     >     >     >     > {
> > > > >> > > >     >     >     >     >    "index"
: "squid"
> > > > >> > > >     >     >     >     >   ,"batchSize"
: 100
> > > > >> > > >     >     >     >     >   ,"writers"
: [ "ES" ]
> > > > >> > > >     >     >     >     > }
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > Thanks in
advance for the feedback!
> > > > Also, if
> > > > >> > you
> > > > >> > > > have
> > > > >> > > >     > any
> > > > >> > > >     >     > other,
> > > > >> > > >     >     >     > better
> > > > >> > > >     >     >     >     > ideas than
the ones presented here,
> > let
> > > me
> > > > >> know
> > > > >> > > > too.
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > Best,
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >     > Casey
> > > > >> > > >     >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >     >
> > > > >> > > >     >     >
> > > > >> > > >     >     >
> > > > >> > > >     >     >
> > > > >> > > >     >     >
> > > > >> > > >     >     >
> > > > >> > > >     >
> > > > >> > > >     >
> > > > >> > > >     >
> > > > >> > > >     >
> > > > >> > > >     >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Nick Allen <nick@nickallen.org>
> > > > >>
> > > >
> > >
> >
> >
> >
> > --
> > Nick Allen <nick@nickallen.org>
> >
>
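
For reference, the "global defaults plus per-writer overrides" behaviour discussed in this thread can be sketched in a few lines of Python. This is only an illustration of the semantics described in the writer-specific cases quoted above, not Metron code. The function name effective_writer_config is hypothetical, and treating a missing "when" as "true" is an assumption drawn from the "all writers write all messages" base case. The Stellar filter is carried as a plain string and is not evaluated here.

```python
from typing import Any, Dict


def effective_writer_config(sensor: str, writer: str,
                            config: Dict[str, Any]) -> Dict[str, Any]:
    """Merge the top-level defaults with the per-writer override, if any.

    Hypothetical helper for illustration only; not part of Metron.
    """
    resolved = {
        # Default index name is the sensor name (the "base case").
        "index": config.get("index", sensor),
        # Default batch size is 1 (the "base case").
        "batchSize": config.get("batchSize", 1),
        # Assumption: no filter specified means every message is written.
        "when": "true",
    }
    # Anything under writerConfig.<writer> overrides the top-level values.
    resolved.update(config.get("writerConfig", {}).get(writer, {}))
    return resolved


# The "writer-specific case with filters" from the thread.
config = {
    "index": "foo",
    "batchSize": 1,
    "writerConfig": {
        "elasticsearch": {"batchSize": 100, "when": "exists(field1)"},
        "hdfs": {"when": "false"},
    },
}

for name in ("elasticsearch", "hdfs"):
    print(name, effective_writer_config("squid", name, config))
```

With this config, the elasticsearch writer resolves to index "foo", batchSize 100 and the filter exists(field1), while the hdfs writer keeps the top-level batchSize of 1 and gets the filter "false". An empty config falls back to the sensor name and a batchSize of 1 for every writer.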
