kafka-users mailing list archives

From Russell Jurney <russell.jur...@gmail.com>
Subject Re: Kafka in AWS?
Date Wed, 21 Mar 2012 20:44:23 GMT
You have code that puts records in bigger blocks on s3? Plz to share? :)

Russell Jurney http://datasyndrome.com

On Mar 21, 2012, at 1:37 PM, Vaibhav Puranik <vpuranik@gmail.com> wrote:

> We also have s3 files organized by date in the following fashion.
> 
> yyyy/MM/dd/hh
> 
> Our messages are in JSON.
> 
> Regards,
> Vaibhav
> 
> On Wed, Mar 21, 2012 at 1:33 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
> 
>> I want the S3 files to be organized by type and date. Folders for types,
>> subfolders for date down to the hour: year/month/day/hour. All payloads of
>> a given type get written together.
>> 
>> It would be ideal if there was no integration with the end format, but in
>> practice I'm not sure if all the serialization protocols mentioned can be
>> written in this way.
>> 
>> Russell Jurney http://datasyndrome.com
>> 
>> On Mar 21, 2012, at 12:50 PM, Tim Lossen <tim@lossen.de> wrote:
>> 
>>> another good option would be messagepack -- flexible & schemaless like
>>> json, but binary.
>>> 
>>> Sent from my iPhone
>>> 
>>> On 21 Mar 2012, at 20:46, Russell Jurney <russell.jurney@gmail.com> wrote:
>>> 
>>>> I'm going to use thrift, avro or protobuf for serialization.
>>>> 
>>>> Russell Jurney http://datasyndrome.com
>>>> 
>>>> On Mar 21, 2012, at 11:59 AM, Vaibhav Puranik <vpuranik@gmail.com> wrote:
>>>> 
>>>>> I would use the payload. I want the message to be exactly as it is. We
>>>>> want to name the files as per topic.
>>>>> (That's how we differentiate right now).
>>>>> 
>>>>> Regards,
>>>>> Vaibhav
>>>>> 
>>>>> On Wed, Mar 21, 2012 at 11:01 AM, Niek Sanders <niek.sanders@gmail.com> wrote:
>>>>> 
>>>>>> So what would you like the S3 files to actually look like?
>>>>>> 
>>>>>> One Kafka message body per line?  Should the message topic be tossed
>>>>>> in there too?
>>>>>> 
>>>>>> A tricky aspect is that the Kafka message body is an opaque byte
>>>>>> array.  For my own case I'm using JSON for the payload so it makes my
>>>>>> requirements simpler.
>>>>>> 
>>>>>> - Niek
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Tue, Mar 20, 2012 at 10:07 PM, Russell Jurney
>>>>>> <russell.jurney@gmail.com> wrote:
>>>>>>> I want events in S3 to process them in Hadoop. I'd like to emit them in
>>>>>>> my app, and have them magically show up in 64MB chunks on S3. Like most
>>>>>>> everyone else.
>>>>>>> 
>>>>>>> Russell Jurney http://datasyndrome.com
>>>>>>> 
>>>>>> 
>> 
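
The layout discussed in this thread (one folder per topic or type, date
subfolders down to the hour, payloads batched into large blocks) can be
sketched roughly as below. This is only an illustration, not code from anyone
in the thread. The names hour_prefix, BlockBuffer and the upload callable are
hypothetical, and the sketch assumes one payload per line, which suits JSON
but not binary formats whose bytes may contain newlines.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hour_prefix(topic, when):
    """Return a key prefix such as 'clicks/2012/03/21/20' (topic/yyyy/MM/dd/hh)."""
    return f"{topic}/{when:%Y/%m/%d/%H}"


class BlockBuffer:
    """Buffers payloads per topic and hour, uploading one object per full block.

    `upload` is a hypothetical callable taking (key, data_bytes); in practice
    it would wrap whatever S3 client is in use.
    """

    def __init__(self, upload, block_bytes=64 * 1024 * 1024):
        self.upload = upload
        self.block_bytes = block_bytes    # flush threshold, 64MB by default
        self.lines = defaultdict(list)    # prefix -> buffered payload lines
        self.sizes = defaultdict(int)     # prefix -> buffered byte count
        self.parts = defaultdict(int)     # prefix -> next part number

    def add(self, topic, payload, when):
        """Append one payload (bytes) and flush its block once it is full."""
        prefix = hour_prefix(topic, when)
        self.lines[prefix].append(payload + b"\n")
        self.sizes[prefix] += len(payload) + 1
        if self.sizes[prefix] >= self.block_bytes:
            self.flush(prefix)

    def flush(self, prefix):
        """Upload whatever is buffered for one prefix as a single object."""
        if not self.lines[prefix]:
            return
        key = f"{prefix}/part-{self.parts[prefix]:05d}.json"
        self.upload(key, b"".join(self.lines[prefix]))
        self.parts[prefix] += 1
        self.lines[prefix] = []
        self.sizes[prefix] = 0

    def flush_all(self):
        """Upload every partially filled block, e.g. on shutdown."""
        for prefix in list(self.lines):
            self.flush(prefix)


if __name__ == "__main__":
    stored = {}  # in-memory stand-in for an S3 bucket
    buf = BlockBuffer(lambda key, data: stored.update({key: data}), block_bytes=32)
    now = datetime.now(timezone.utc)
    for i in range(5):
        buf.add("clicks", b'{"n": %d}' % i, now)
    buf.flush_all()
    for key in sorted(stored):
        print(key, len(stored[key]), "bytes")
```

Against real S3, the upload callable could wrap a client call such as boto3's
put_object(Bucket=..., Key=..., Body=...). For Thrift, Avro or Protobuf
payloads, the newline separator would have to be replaced by some
length-prefixed framing, since the message body is an opaque byte array.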
