metron-dev mailing list archives

From Michel Sumbul <michelsum...@gmail.com>
Subject Re: Architectural reason to split in 4 topologies / impact on the kafka ressources
Date Mon, 25 Jun 2018 22:12:48 GMT
Hi Casey,

That makes complete sense.
Short question: if there is no enrichment or no profiling, does the message
still pass through the enrichment/profiling topic?

If yes, do you think it would be possible for messages that
don't need enrichment or profiling to skip the topic and go directly
to the next one? This is again to avoid the extra in/out through Kafka.
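
A minimal sketch of that routing idea, purely for illustration and not
Metron's actual API: the topic names, the function name and the shape of
the configuration are all assumptions.

```python
# Hypothetical sketch only: none of these names come from Metron itself.
ENRICHMENT_TOPIC = "enrichments"  # assumed name of the enrichment input topic
INDEXING_TOPIC = "indexing"       # assumed name of the indexing input topic


def next_topic(sensor_type, enrichment_config, profiled_sensors):
    """Pick the Kafka topic a parsed message should be written to.

    enrichment_config maps a sensor type to its list of enrichments;
    profiled_sensors is the set of sensor types used by any profile.
    """
    needs_enrichment = bool(enrichment_config.get(sensor_type))
    needs_profiling = sensor_type in profiled_sensors
    if needs_enrichment or needs_profiling:
        return ENRICHMENT_TOPIC
    # Nothing to enrich or profile: skip one Kafka write/read hop.
    return INDEXING_TOPIC
```

With this, a sensor that has no enrichments and no profile would be written
straight to the indexing topic, saving one round trip through Kafka.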

Thanks for the explanation,
Michel

2018-06-23 3:58 GMT+01:00 Casey Stella <cestella@gmail.com>:

> Hey Michel,
>
> Those are good questions and there were some reasons surrounding that.  In
> fact, historically, we had fewer topologies (e.g. indexing and enrichment
> were merged). Even earlier on, we had just one giant topology per parser
> that enriched and indexed.  The long story short is that we moved this way
> because we saw how people were using metron and we gained more insight
> tuning Metron.  That led us down this architectural path.
>
> Some of the reasons that we went this way:
>
>    - Fewer large topologies were a nightmare to tune
>       - Enrichment would have different memory requirements than, say,
>       parsers or indexing
>       - You can adjust the kafka topic params per topology to adjust the
>       number of partitions, etc.
>    - Having the separate topologies gives a natural set of extension points
>    for customization and enhancement (e.g. you want a phase between parsing
>    and enrichment).
>    - Decoupling the topologies lets us spin up and down parts of Metron
>    without affecting others (e.g. you don't have to take down enrichments
>    to add a parser, even for a moment)
>    - The movement to Flux meant we were limited in how much we could adjust
>    the topology at runtime (e.g. colocating parsers and enrichment would
>    mean moving away from Flux essentially as the topology changes its
>    structure)
>
> Best,
>
> Casey
>
>
> On Fri, Jun 22, 2018 at 5:25 PM Michel Sumbul <michelsumbul@gmail.com>
> wrote:
>
> > Hi Everyone,
> >
> > I was wondering what the architectural reason was for splitting
> > ingestion in Metron into 4 different topologies that all read/write to
> > Kafka?
> >
> > For example, why have the parsing and enrichment topologies not been
> > merged? Would it not be possible to enrich the message directly when
> > you parse it?
> >
> > I'm asking because splitting into several topologies means that all of
> > the topologies read/write to Kafka, which produces a bigger load on the
> > Kafka cluster and therefore a need for far more infrastructure/servers.
> > The cost is especially significant when we speak about TBs of data
> > ingested every day.
> >
> > I'm sure there was a very good reason, I was just curious.
> >
> > Thanks,
> > Michel
> >
>
