spark-user mailing list archives

From trung kien <kient...@gmail.com>
Subject Re: Spark Streaming - Kafka Direct Approach: re-compute from specific time
Date Wed, 25 May 2016 18:35:34 GMT
Ah right, I see.

Thank you very much.
On May 25, 2016 11:11 AM, "Cody Koeninger" <cody@koeninger.org> wrote:

> There's an overloaded createDirectStream method that takes a map from
> topicpartition to offset for the starting point of the stream.
>
> On Wed, May 25, 2016 at 9:59 AM, trung kien <kientt86@gmail.com> wrote:
> > Thanks, Cody.
> >
> > I can build the mapping from time -> offset. However, how can I pass that
> > offset to the Spark Streaming job (using the Direct Approach)?
> >
> > On May 25, 2016 9:42 AM, "Cody Koeninger" <cody@koeninger.org> wrote:
> >>
> >> Kafka does not yet have meaningful time indexing. There's a Kafka
> >> improvement proposal for it, but it has gotten pushed back to at least
> >> 0.10.1.
> >>
> >> If you want to do this kind of thing, you will need to maintain your
> >> own index from time to offset.
> >>
> >> On Wed, May 25, 2016 at 8:15 AM, trung kien <kientt86@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > Is there any way to re-compute from a specific time using the
> >> > Spark Streaming - Kafka Direct Approach?
> >> >
> >> > In some cases, I want to re-compute from a specific time (e.g. the
> >> > beginning of the day). Is that possible?
> >> >
> >> >
> >> >
> >> > --
> >> > Thanks
> >> > Kien
>
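
For readers of this thread, the two suggestions above (keep your own index from time to offset, then hand a per-partition map of starting offsets to the stream) could be sketched roughly as below. This is only an illustration using the Java standard library: the class TimeOffsetIndex, its methods, and the choice to fall back to the earliest recorded checkpoint are all hypothetical and do not come from Spark or Kafka.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: a time -> offset index kept per (topic, partition). */
class TimeOffsetIndex {
    // (topic, partition) -> sorted map of checkpoint timestamp (ms) -> offset seen at that time
    private final Map<Map.Entry<String, Integer>, TreeMap<Long, Long>> index = new HashMap<>();

    /** Record that `offset` was the partition's current offset at `timestampMs`. */
    void record(String topic, int partition, long timestampMs, long offset) {
        index.computeIfAbsent(new SimpleImmutableEntry<>(topic, partition), k -> new TreeMap<>())
             .put(timestampMs, offset);
    }

    /** For each known partition, the offset to start from in order to replay from `fromMs`. */
    Map<Map.Entry<String, Integer>, Long> startingOffsets(long fromMs) {
        Map<Map.Entry<String, Integer>, Long> result = new HashMap<>();
        for (Map.Entry<Map.Entry<String, Integer>, TreeMap<Long, Long>> e : index.entrySet()) {
            TreeMap<Long, Long> checkpoints = e.getValue();
            // Latest checkpoint at or before the requested time: replaying from it
            // may re-read a few earlier messages but does not skip any after fromMs.
            Map.Entry<Long, Long> hit = checkpoints.floorEntry(fromMs);
            if (hit == null) {
                // Assumption: with no checkpoint that old, use the earliest one recorded.
                hit = checkpoints.firstEntry();
            }
            result.put(e.getKey(), hit.getValue());
        }
        return result;
    }
}
```

In a streaming job, each (topic, partition) key of the returned map would presumably be converted to the connector's topic-partition type and the map passed as the starting-offsets argument of the createDirectStream overload mentioned above; check the exact signature against the Spark version in use.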
