kafka-users mailing list archives

From Jason Huang <jason.hu...@icare.com>
Subject Re: Fetch messages since a specific time?
Date Mon, 17 Dec 2012 21:37:35 GMT
I see.

Thanks for your response, Tom.

Jason

On Mon, Dec 17, 2012 at 3:41 PM, Tom Brown <tombrown52@gmail.com> wrote:
> Messages do not have individual time stamps. Groups of messages (I think the
> default is around 500 MB) are stored in individual files, and the time stamp
> parameter will find the offset at the beginning of the file that has that
> time stamp, which is not really helpful for your use case.
>
> The accepted solution is to store the offsets in a DB, or some other
> location.
>
> --Tom
>
> On Monday, December 17, 2012, Mathias Söderberg wrote:
>
>> Hm, alright. I haven't really used the method for anything besides getting
>> the first and last offsets (using -1 and -2 as timestamps IIRC) of a
>> topic+partition combination.
>>
>> Maybe someone else can shed some light on this?
>>
>> Cheers,
>> Mathias
>>
>>
>> On 17 December 2012 19:51, Jason Huang <jason.huang@icare.com> wrote:
>>
>> > Mathias,
>> >
>> > Thanks for the response. I am not sure if this timestamp is the Unix time
>> > or not. I've tried the following:
>> >
>> > I created 3 messages on the same topic and partition, like this:
>> > 1355769714152: Jason has a new message 1
>> > 1355769964900: Jason has a new message 2
>> > 1355769980296: Jason has a new message 3
>> >
>> > I then tried to call getOffsetsBefore with a timestamp = 1355769964999
>> > (99 milliseconds after the timestamp in message two above), hoping to
>> > get some offset but the long array returned by the call is empty.
>> >
>> > Some Google searching suggests that getOffsetsBefore is based on the mtime
>> > (last-modified time) of the log segments. In other words, if I only have
>> > one log file, 00000000000000000000.kafka, in the topic directory
>> > (log/topic-0), then the offset array returned by this call will always be 0?
>> >
>> > If so, this API is probably not designed for my use case.
>> >
>> > thanks,
>> >
>> > Jason
>> >
>> >
>> > On Mon, Dec 17, 2012 at 1:40 PM, Mathias Söderberg
>> > <mathias.soederberg@gmail.com> wrote:
>> > > The SimpleConsumer API [1] has a method called getOffsetsBefore which
>> > > takes a topic, partition, timestamp (UNIX I assume since it's a long) and
>> > > an integer limit on how many offsets to get.
>> > >
>> > > Might not solve your problem *exactly*, but could be useful, unless
>> > > you're using the ConsumerConnector?
>> > >
>> > > [1]: http://people.apache.org/~joestein/kafka-0.7.1-incubating-docs/
>> > >
>> > >
>> > > On 17 December 2012 19:23, Jason Huang <jason.huang@icare.com> wrote:
>> > >
>> > >> Hello,
>> > >>
>> > >> Is it possible to fetch messages from the Kafka message queue since a
>> > >> specific time? For example, a user may subscribe to a topic and the
>> > >> producer will continuously publish messages related to this topic. The
>> > >> first time this user logs in, we will fetch all the messages from the
>> > >> beginning. However, the next time this user logs in, we want to only
>> > >> fetch the "new" messages. In other words, messages since the user's
>> > >> last log out time.
>> > >>
>> > >> Is there any API in Kafka that allows us to do that? I am not sure if
>> > >> Kafka actually stores a timestamp with each message as the message's
>> > >> meta data. If not, is there any way to fetch the offset related to the
>> > >> user's last log out time?
>> > >>
>> > >> One way that I can think of to do this is to store the offset of the
>> > >> last message this user consumed before he logged out of the system
>> > >> (persisting this offset in a DB). The next time this user logs in, we
>> > >> will read the DB to get that offset and start fetching messages from
>> > >> there. However, if there is a better way to do this in Kafka, then
>> > >> it will save me the work of writing to and reading from the DB.
>> > >>
>> > >> thanks!
>> > >>
>> > >> Jason
>> > >>
>> >
>>
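
The empty array Jason got back from getOffsetsBefore can be illustrated with a simplified model of the behaviour described in the thread: the lookup only considers whole log segment files and their modification times, never individual messages. This is a sketch under that assumption, not Kafka's actual implementation. The `Segment` type and the `offsets_before` function are hypothetical names made up for the illustration, and the offsets and times in the multi-segment example are invented.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """Hypothetical stand-in for one log segment file on the broker."""
    base_offset: int        # offset of the first message in the file
    last_modified_ms: int   # file modification time, ms since the epoch


def offsets_before(segments: List[Segment], timestamp_ms: int,
                   max_offsets: int) -> List[int]:
    """Return base offsets of segments last modified at or before
    timestamp_ms, newest first, capped at max_offsets entries."""
    eligible = [s.base_offset for s in segments
                if s.last_modified_ms <= timestamp_ms]
    eligible.sort(reverse=True)
    return eligible[:max_offsets]


# Jason's case: a single segment whose modification time is assumed to be
# the time of the last write (message 3, at 1355769980296).
single_segment = [Segment(base_offset=0, last_modified_ms=1355769980296)]
print(offsets_before(single_segment, 1355769964999, 10))  # []

# An invented log that has rolled over twice.
rolled = [Segment(0, 1000), Segment(500, 2000), Segment(900, 3000)]
print(offsets_before(rolled, 2500, 10))  # [500, 0]
```

In this model the query time (1355769964999) is earlier than the only segment's modification time, so no segment qualifies and the result is empty. Even with several segments, the answer is only ever a segment's starting offset, which is why the call cannot pinpoint "messages since time T".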

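The solution Tom mentions, storing the consumed offsets in a DB, could look roughly like the sketch below. It uses SQLite from the Python standard library purely for illustration. The table layout and the `open_store`, `save_offset`, and `load_offset` helpers are assumptions invented here, not part of Kafka, and the fetch from Kafka itself is left out.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical offset store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS consumer_offsets (
               user_id      TEXT    NOT NULL,
               topic        TEXT    NOT NULL,
               partition_id INTEGER NOT NULL,
               next_offset  INTEGER NOT NULL,
               PRIMARY KEY (user_id, topic, partition_id)
           )"""
    )
    return conn


def save_offset(conn, user_id, topic, partition_id, next_offset):
    """Record where this user should resume, e.g. when they log out."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO consumer_offsets VALUES (?, ?, ?, ?)",
            (user_id, topic, partition_id, next_offset),
        )


def load_offset(conn, user_id, topic, partition_id, default=0):
    """Return the saved offset, or `default` for a first-time user."""
    row = conn.execute(
        "SELECT next_offset FROM consumer_offsets "
        "WHERE user_id = ? AND topic = ? AND partition_id = ?",
        (user_id, topic, partition_id),
    ).fetchone()
    return row[0] if row else default


store = open_store()
print(load_offset(store, "jason", "news", 0))  # 0 (first login: start at the beginning)
save_offset(store, "jason", "news", 0, 1234)   # user logs out after consuming up to 1234
print(load_offset(store, "jason", "news", 0))  # 1234 (next login resumes here)
```

The offset returned by `load_offset` is what would be handed to the consumer's fetch call on the next login, so only messages published since the last session are read.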