storm-user mailing list archives

From Pratyusha Rasamsetty <pratyush...@raremile.com>
Subject Re: Trident state acknowledgement
Date Wed, 17 Aug 2016 11:53:46 GMT
Hi Tousif,

Kafka is not present in our architecture as of now. We are not planning to
add that complexity.

Let's assume we write the parent to some queue. The tuple in the queue
still needs to be processed "only when" all children have been inserted
into Elasticsearch.

So first, we need to know whether all children were inserted into
Elasticsearch, which we can learn from the BulkResponse in the Trident
state. We need to emit or maintain the state of all successfully inserted
child tuples, so that I can count them with a count aggregation.

That is my question as of now: how to maintain the information about
successfully inserted and failed tuples, so that I can later do a check
and process the parent tuple.
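
To make the bookkeeping concrete, here is a minimal sketch in plain Java,
with no Storm or Elasticsearch dependency. The class name
ChildCompletionTracker and its methods are hypothetical, and it assumes
every child tuple carries its parent's id and its own unique id. It keeps
everything in memory, so it only illustrates the counting logic; it is not
a fault-tolerant Trident state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks, per parent, which children were indexed. */
class ChildCompletionTracker {

    /** Outcome of the children belonging to one parent. */
    private static final class Outcome {
        final int expected;                       // number of children the parent produced
        final Set<String> succeeded = new HashSet<>();
        final Set<String> failed = new HashSet<>();

        Outcome(int expected) {
            this.expected = expected;
        }
    }

    private final Map<String, Outcome> byParent = new HashMap<>();

    /** Register a parent together with the number of children it emitted. */
    void expect(String parentId, int expectedChildren) {
        byParent.put(parentId, new Outcome(expectedChildren));
    }

    /**
     * Record the result of one item of a bulk response.
     * Ids are kept in sets, so a replayed batch does not double-count.
     *
     * @return true once every expected child of the parent has succeeded
     */
    boolean record(String parentId, String childId, boolean success) {
        Outcome o = byParent.get(parentId);
        if (o == null) {
            throw new IllegalStateException("unknown parent: " + parentId);
        }
        if (success) {
            o.succeeded.add(childId);
            o.failed.remove(childId);             // a retry that succeeded clears the failure
        } else if (!o.succeeded.contains(childId)) {
            o.failed.add(childId);
        }
        return o.succeeded.size() == o.expected;
    }

    /** Number of children of this parent whose latest result is a failure. */
    int failedCount(String parentId) {
        Outcome o = byParent.get(parentId);
        return o == null ? 0 : o.failed.size();
    }
}
```

The idea would be to call record(...) once per item of the bulk response
and to process the parent only when it returns true. Where this
information should live so that it survives worker restarts is the part I
am still unsure about.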

Thanks
Pratyusha

On Wed, Aug 17, 2016 at 4:54 PM, Tousif <tousif.pasha@gmail.com> wrote:

> Hi Pratyusha,
>
> Is it possible for you to write back to a Kafka topic after the parent
> tuple is processed, so that another topology can read it?
>
> On Wed, Aug 17, 2016 at 3:16 PM, Pratyusha Rasamsetty <
> pratyusha.r@raremile.com> wrote:
>
>> Hi all,
>>
>> My requirement is to process and index a huge data set to
>> Elasticsearch.
>>
>> For each tuple that the spout emits, about 100 child tuples get emitted.
>> Each of them needs to be processed and indexed to Elasticsearch. Once the
>> children are indexed, the parent tuple that the spout emitted also needs
>> to be indexed, by querying the already indexed children.
>>
>> I am able to achieve the whole functionality using plain Storm. But
>> since processing a tuple takes too long, I had to disable guaranteed
>> message processing.
>>
>> Since processing tuples repeatedly is a costly operation for me, I
>> decided to use Storm Trident, as it claims to support exactly-once
>> processing.
>>
>> The problem here is I could not achieve the complete functionality with
>> trident.
>>
>> I have to index the children and, based on the bulk response that I get
>> from Elasticsearch, I need to emit some more tuples for further
>> processing. I understand that we can use Trident state for doing batch
>> inserts to Elasticsearch. But I could not emit from the Trident state
>> based on the response.
>>
>> Please help me solve this - "Batch insert and emit based on response
>> using Trident."
>>
>>
>> Thanks
>> Pratyusha
>>
>
>
>
> --
>
>
> Regards
> Tousif Khazi
>
>
