storm-user mailing list archives

From Tousif <tousif.pa...@gmail.com>
Subject Re: Trident state acknowledgement
Date Wed, 17 Aug 2016 11:58:30 GMT
Take a look at Complex event processing and windowing
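
For the bookkeeping asked about in the quoted message below (knowing when every child of a parent has been indexed), a minimal sketch in plain Java might look like the following. Everything here is an assumption made for illustration: `ChildIndexTracker`, `expect`, `record`, and `Status` are hypothetical names, not Storm or Elasticsearch APIs. The map lives in memory only, so in a real Trident topology this information would need to be kept in a persistent State to survive restarts. Child IDs are stored in sets rather than counters so that a replayed batch does not count the same child twice.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks, per parent, which children were indexed. */
final class ChildIndexTracker {

    /** PENDING: results still missing. READY: all children indexed. FAILED: all reported, some failed. */
    enum Status { PENDING, READY, FAILED }

    private static final class Children {
        final int expected;                          // number of children the parent produced
        final Set<String> succeeded = new HashSet<>();
        final Set<String> failed = new HashSet<>();
        Children(int expected) { this.expected = expected; }
    }

    private final Map<String, Children> byParent = new HashMap<>();

    /** Register a parent and how many children it is expected to have. */
    void expect(String parentId, int childCount) {
        byParent.put(parentId, new Children(childCount));
    }

    /** Record one per-item result taken from a bulk response; safe to call again on replay. */
    Status record(String parentId, String childId, boolean indexed) {
        Children c = lookup(parentId);
        if (indexed) {
            c.failed.remove(childId);                // a successful retry clears an earlier failure
            c.succeeded.add(childId);
        } else if (!c.succeeded.contains(childId)) {
            c.failed.add(childId);
        }
        return status(parentId);
    }

    /** Current state of a parent, derived from the recorded child results. */
    Status status(String parentId) {
        Children c = lookup(parentId);
        if (c.succeeded.size() == c.expected) {
            return Status.READY;
        }
        if (c.succeeded.size() + c.failed.size() >= c.expected) {
            return Status.FAILED;
        }
        return Status.PENDING;
    }

    private Children lookup(String parentId) {
        Children c = byParent.get(parentId);
        if (c == null) {
            throw new IllegalStateException("unknown parent: " + parentId);
        }
        return c;
    }
}
```

The caller would feed each item of a bulk response into `record` and index the parent only once `status` returns `READY`; a `FAILED` parent could be retried or reported, depending on what the pipeline needs.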

On Wed, Aug 17, 2016 at 5:23 PM, Pratyusha Rasamsetty <
pratyusha.r@raremile.com> wrote:

> Hi Tousif,
>
> Kafka is not present in our architecture as of now. We are not planning to
> add that complexity.
>
> Let's assume we write the parent to some queue. The tuple in the queue
> still needs to be processed "only when" all children have been inserted
> into Elasticsearch.
>
> So first, we need to know whether all children were inserted into
> Elasticsearch, which we can learn from the bulk response in the Trident
> state. We need to emit or maintain the state of all successfully inserted
> child tuples, so that I can count them with a count aggregation.
>
> That is my doubt as of now: how to maintain the information about
> successfully inserted and failed tuples, so that I can later do a check
> and process the parent tuple.
>
> Thanks
> Pratyusha
>
> On Wed, Aug 17, 2016 at 4:54 PM, Tousif <tousif.pasha@gmail.com> wrote:
>
>> Hi Pratyusha,
>>
>> Is it possible for you to write back to a Kafka topic after the parent
>> tuple is processed, so that another topology can read it?
>>
>> On Wed, Aug 17, 2016 at 3:16 PM, Pratyusha Rasamsetty <
>> pratyusha.r@raremile.com> wrote:
>>
>>> Hi all,
>>>
>>> *My requirement is* to process and index a huge data set into
>>> Elasticsearch.
>>>
>>> For each tuple that the spout emits, about 100 child tuples get emitted.
>>> Each of them needs to be processed and indexed to Elasticsearch. Once the
>>> children are indexed, the parent tuple that the spout emitted also needs
>>> to be indexed, by querying the children index, which has already been
>>> indexed.
>>>
>>> I am able to achieve the whole functionality using normal Storm. But
>>> since processing a tuple takes too long, I had to disable guaranteed
>>> message processing.
>>>
>>> Since processing tuples repeatedly is a costly operation for me, I
>>> decided to use Storm Trident, as it claims to support exactly-once
>>> processing.
>>>
>>> The problem here is that I could not achieve the complete functionality
>>> with Trident.
>>>
>>> I have to index the children and, based on the bulk response that I get
>>> from Elasticsearch, emit some more tuples for further processing. I
>>> understand that we can use Trident state for doing a batch insert into
>>> Elasticsearch. But I could not emit from the Trident state based on the
>>> response.
>>>
>>> Please help me solve this - "Batch insert and emit based on response
>>> using Trident."
>>>
>>>
>>> Thanks
>>> Pratyusha
>>>
>>
>>
>>
>> --
>>
>>
>> Regards
>> Tousif Khazi
>>
>>
>


-- 


Regards
Tousif Khazi
