kafka-users mailing list archives

From Anurag <anurag.pha...@gmail.com>
Subject Re: Aggregating tomcat, log4j, other logs in realtime
Date Thu, 29 Sep 2011 18:54:06 GMT
Eric,
Thanks... we rotate logs on an hourly basis, just wanted to know if
there's anything different that we might be missing.

-anurag


On Thu, Sep 29, 2011 at 11:47 AM, Eric Hauser <ewhauser@gmail.com> wrote:
> Anurag,
>
> I wouldn't tail the log files, but instead make use of Apache's
> facilities to pipe the logs to another program:
>
> http://httpd.apache.org/docs/2.2/mod/core.html#errorlog
> http://httpd.apache.org/docs/2.0/programs/rotatelogs.html
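For reference, with piped logging (e.g. an ErrorLog "|/path/to/receiver" line
in httpd.conf, per the first link above) the receiving program only ever reads
stdin, so rotation never invalidates a file handle. A minimal sketch of such a
receiver in Python follows; the script path and the downstream handling are
just placeholders:

#!/usr/bin/env python3
# Minimal piped-log receiver: Apache writes each log line to our stdin,
# so we never touch the log files or worry about rotation here.
import sys

def handle(line):
    # Placeholder downstream step: hand the line to a collector, broker, etc.
    sys.stdout.write(line)

for line in sys.stdin:
    handle(line)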
>
>
> On Thu, Sep 29, 2011 at 2:38 PM, Anurag <anurag.phadke@gmail.com> wrote:
>> Eric/Jun,
>> Can you throw some light on how to handle Apache log rotation? AFAIK,
>> even if we write custom code to tail a file, the file handle is lost
>> on rotation, which might result in some loss of data.
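For what it's worth, if tailing ever does become necessary, the usual
workaround is to watch for the file's inode changing and re-open it. A rough
Python sketch (the path and polling interval are arbitrary), which still has
the small data-loss window around rotation mentioned above:

#!/usr/bin/env python3
# Rough "tail -F"-style follower: re-open the log when its inode changes.
import os
import time

PATH = "/var/log/httpd/access_log"   # arbitrary example path

def follow(path):
    f = open(path)
    f.seek(0, os.SEEK_END)
    inode = os.fstat(f.fileno()).st_ino
    while True:
        line = f.readline()
        if line:
            yield line
            continue
        try:
            # If the path now points at a different inode, the file was
            # rotated; switch to the new one. Lines appended to the old
            # file in the meantime can still be missed, which is exactly
            # the data-loss risk mentioned above.
            if os.stat(path).st_ino != inode:
                f.close()
                f = open(path)
                inode = os.fstat(f.fileno()).st_ino
                continue
        except FileNotFoundError:
            pass   # rotation in progress, try again shortly
        time.sleep(1.0)

for entry in follow(PATH):
    print(entry, end="")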
>>
>>
>> On Thu, Sep 29, 2011 at 11:35 AM, Jeremy Hanna
>> <jeremy.hanna1234@gmail.com> wrote:
>>> Thanks a lot for the comparison, Eric.  Really good to hear a perspective
>>> from a user of both.
>>>
>>> On Sep 29, 2011, at 1:25 PM, Eric Hauser wrote:
>>>
>>>> Jeremy,
>>>>
>>>> I've used both Flume and Kafka, and I can provide some info for comparison:
>>>>
>>>> Flume
>>>> - The current Flume release 0.9.4 has some pretty nasty bugs in it
>>>> (most have been fixed in trunk).
>>>> - Flume is more complex to maintain operations-wise (IMO) than Kafka,
>>>> since you have to set up masters and collectors (you don't necessarily
>>>> need collectors if you aren't writing to HDFS)
>>>> - Flume has a well defined pattern for doing what you want:
>>>> http://www.cloudera.com/blog/2010/09/using-flume-to-collect-apache-2-web-server-logs/
>>>>
>>>> Kafka
>>>> - If you need multiple Kafka partitions for the logs, you will want to
>>>> partition by host so that messages from the same host arrive in order
>>>> - You can use the same piped technique as Flume to publish to Kafka,
>>>> but you'll have to write a little code to publish and subscribe to the
>>>> stream
>>>> - Kafka does not provide any of the file rolling, compression, etc.
>>>> that Flume provides
>>>> - If you ever want to do anything more interesting with those log
>>>> files than just send them to one location, publishing them to Kafka
>>>> would allow you to add additional consumers later.  Flume has a
>>>> concept of fanout sinks, but I don't care for the way it works.
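To make the "little code" and the partition-by-host points above concrete,
here is a minimal producer sketch that reads piped log lines from stdin and
publishes them keyed by host, using the kafka-python client as an example
(the topic name and broker address are made up):

#!/usr/bin/env python3
# Sketch: publish piped log lines to Kafka, keyed by hostname so that
# messages from the same host stay in order within a partition.
import socket
import sys

from kafka import KafkaProducer   # assumes the kafka-python package

producer = KafkaProducer(bootstrap_servers="kafka-broker:9092")  # made-up address
host_key = socket.gethostname().encode("utf-8")

for line in sys.stdin:
    # Messages sharing a key land in the same partition, which gives the
    # per-host ordering described above.
    producer.send("httpd-logs", key=host_key, value=line.encode("utf-8"))

producer.flush()

A consumer would simply subscribe to the same topic, which is what makes
adding extra consumers later cheap.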
>>>>
>>>>
>>>>
>>>> On Thu, Sep 29, 2011 at 1:48 PM, Jun Rao <junrao@gmail.com> wrote:
>>>>> Jeremy,
>>>>>
>>>>> Yes, Kafka will be a good fit for that.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jun
>>>>>
>>>>> On Thu, Sep 29, 2011 at 10:12 AM, Jeremy Hanna
>>>>> <jeremy.hanna1234@gmail.com>wrote:
>>>>>
>>>>>> We have a number of web servers in EC2, and periodically we just blow
>>>>>> them away and create new ones.  That makes keeping logs problematic.
>>>>>> We're looking for a way to stream the logs from those various sources
>>>>>> directly to a central log server - either just a single server or HDFS
>>>>>> or something like that.
>>>>>>
>>>>>> My question is whether Kafka is a good fit for that, or should I be
>>>>>> looking more along the lines of Flume or Scribe?
>>>>>>
>>>>>> Many thanks.
>>>>>>
>>>>>> Jeremy
>>>>>
>>>
>>>
>>
>
