metron-dev mailing list archives

From Ali Nazemian <alinazem...@gmail.com>
Subject Re: Heterogeneous indexing batch size for different Metron feeds
Date Wed, 06 Dec 2017 13:18:00 GMT
Everything looks normal except the high number of failed tuples. Do you
know how the indexing batch size works? Based on our observations, a batch
size change does not seem to apply to messages that are already in the
enrichments and indexing topics.

On Thu, Dec 7, 2017 at 12:13 AM, Otto Fowler <ottobackwards@gmail.com>
wrote:

> What do you see in the storm ui for the indexing topology?
>
>
> On December 6, 2017 at 07:10:17, Ali Nazemian (alinazemian@gmail.com)
> wrote:
>
> Both HDFS and Elasticsearch batch sizes. There is no error in the logs. It
> impacts the topology error rate and causes an almost 90% error rate on
> indexing tuples.
>
> On 6 Dec. 2017 00:20, "Otto Fowler" <ottobackwards@gmail.com> wrote:
>
> Where are you seeing the errors?  Screenshot?
>
>
> On December 5, 2017 at 08:03:46, Otto Fowler (ottobackwards@gmail.com)
> wrote:
>
> Which of the indexing options are you changing the batch size for?  HDFS?
> Elasticsearch?  Both?
>
> Can you give an example?
>
>
>
> On December 5, 2017 at 02:09:29, Ali Nazemian (alinazemian@gmail.com)
> wrote:
>
> No specific error in the logs. I haven't enabled debug/trace, though.
>
> On Tue, Dec 5, 2017 at 11:54 AM, Otto Fowler <ottobackwards@gmail.com>
> wrote:
>
>> My first thought is what are the errors when you get a high error rate?
>>
>>
>> On December 4, 2017 at 19:34:29, Ali Nazemian (alinazemian@gmail.com)
>> wrote:
>>
>> Any thoughts?
>>
>> On Sun, Dec 3, 2017 at 11:27 PM, Ali Nazemian <alinazemian@gmail.com>
>> wrote:
>>
>> > Hi,
>> >
>> > We have noticed recently that, no matter what batch size we use for
>> > Metron indexing feeds, as soon as we use different batch sizes for
>> > different feeds, indexing topology throughput starts dropping due to a
>> > high error rate. So I was wondering whether, given the current indexing
>> > topology design, we have to choose the same batch size for all the
>> > feeds; otherwise, throughput drops. I assume that, since it is
>> > acceptable to use different batch sizes for different feeds, this is
>> > not expected by design.
>> >
>> > Moreover, I have noticed in practice that even if we change the batch
>> > size, it does not affect the messages that are already in the
>> > enrichments or indexing topics; it only affects new messages coming to
>> > the parser. Therefore, we need to let all the messages pass through the
>> > indexing topology before a batch size change takes effect!
>> >
>> > It would be great if we could have more details regarding the design of
>> > this section, so we can understand whether our observations are by
>> > design or some kind of bug.
>> >
>> > Regards,
>> > Ali
>> >
>>
>>
>>
>> --
>> A.Nazemian
>>
>>
>
>
> --
> A.Nazemian
>
>
>


-- 
A.Nazemian
