spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Takeshi Yamamuro <linguin....@gmail.com>
Subject Re: spark streaming with kinesis
Date Tue, 15 Nov 2016 02:53:34 GMT
Seems it it not a good design to frequently restart workers in a minute
because
their initialization and shutdown take much time as you said
(e.g., interconnection overheads with dynamodb and graceful shutdown).

Anyway, since this is a kind of questions about the aws kinesis library, so
you'd better to ask aws guys in their forum or something.

// maropu


On Mon, Nov 14, 2016 at 11:20 PM, Shushant Arora <shushantarora09@gmail.com>
wrote:

> 1.No, I want to implement low level consumer on kinesis stream.
> so need to stop the worker once it read the latest sequence number sent by
> driver.
>
> 2.What is the cost of frequent register and deregister of worker node. Is
> that when worker's shutdown is called it will terminate run method but
> leasecoordinator will wait for 2seconds before releasing the lease. So I
> cannot deregister a worker in less than 2 seconds ?
>
> Thanks!
>
>
>
> On Mon, Nov 14, 2016 at 7:36 PM, Takeshi Yamamuro <linguin.m.s@gmail.com>
> wrote:
>
>> Is "aws kinesis get-shard-iterator --shard-iterator-type LATEST" not
>> enough for your usecase?
>>
>> On Mon, Nov 14, 2016 at 10:23 PM, Shushant Arora <
>> shushantarora09@gmail.com> wrote:
>>
>>> Thanks!
>>> Is there a way to get the latest sequence number of all shards of a
>>> kinesis stream?
>>>
>>>
>>>
>>> On Mon, Nov 14, 2016 at 5:43 PM, Takeshi Yamamuro <linguin.m.s@gmail.com
>>> > wrote:
>>>
>>>> Hi,
>>>>
>>>> The time interval can be controlled by `IdleTimeBetweenReadsInMillis`
>>>> in KinesisClientLibConfiguration though,
>>>> it is not configurable in the current implementation.
>>>>
>>>> The detail can be found in;
>>>> https://github.com/apache/spark/blob/master/external/kinesis
>>>> -asl/src/main/scala/org/apache/spark/streaming/kinesis/Kines
>>>> isReceiver.scala#L152
>>>>
>>>> // maropu
>>>>
>>>>
>>>> On Sun, Nov 13, 2016 at 12:08 AM, Shushant Arora <
>>>> shushantarora09@gmail.com> wrote:
>>>>
>>>>> *Hi *
>>>>>
>>>>> *is **spark.streaming.blockInterval* for kinesis input stream is
>>>>> hardcoded to 1 sec or is it configurable ? Time interval at which receiver
>>>>> fetched data from kinesis .
>>>>>
>>>>> Means stream batch interval cannot be less than *spark.streaming.blockInterval
>>>>> and this should be configrable , Also is there any minimum value for
>>>>> streaming batch interval ?*
>>>>>
>>>>> *Thanks*
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ---
>>>> Takeshi Yamamuro
>>>>
>>>
>>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
>


-- 
---
Takeshi Yamamuro

Mime
View raw message