kafka-users mailing list archives

From Joseph Lawson <jlaw...@roomkey.com>
Subject RE: Apache webserver access logs + Kafka producer
Date Thu, 07 Aug 2014 21:40:01 GMT
PS: you can also try feeding the logs straight into a Kafka console producer:

TransferLog "| /opt/kafka/bin/kafka-console-producer.sh --topic apache --broker-list broker-1:9092"
ErrorLog "| /opt/kafka/bin/kafka-console-producer.sh --topic apache-errors --broker-list broker-1:9092"

You can pipe a custom log into it the same way :)
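If the console producer turns out to be too heavy for a piped log, the same idea can be sketched as a small script that reads Apache's piped output from stdin. This is only a rough sketch: the `forward_lines` helper is a made-up name, and `main()` assumes the third-party kafka-python package, which nobody in this thread has mentioned or tested. The broker and topic are the ones from the directives above.

```python
import sys
from typing import Callable, Iterable


def forward_lines(stream: Iterable[str], send: Callable[[bytes], None]) -> int:
    """Pass each non-empty log line to send() as UTF-8 bytes; return the count."""
    count = 0
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines rather than producing empty messages
        send(line.encode("utf-8"))
        count += 1
    return count


def main() -> None:
    # Assumption: kafka-python is installed (pip install kafka-python).
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="broker-1:9092")
    try:
        # Apache writes one log line per request to this process's stdin.
        forward_lines(sys.stdin, lambda value: producer.send("apache", value))
    finally:
        producer.flush()  # push out anything still buffered before exiting
        producer.close()
```

To use it as a piped log program, call `main()` at the bottom of the script and point TransferLog or CustomLog at the script instead of kafka-console-producer.sh. Keeping the send function as a parameter means the line handling can be exercised without a broker.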
From: Joseph Lawson <jlawson@roomkey.com>
Sent: Thursday, August 07, 2014 5:35 PM
To: users@kafka.apache.org; Philip O'Toole
Subject: RE: Apache webserver access logs + Kafka producer

Check out my logstash-kafka project:


I believe the plugin will be merged into logstash itself soon but for now you can make it

I would suggest defining your Apache log format as JSON in your Apache config, then streaming
the data through the logstash Kafka output (producer) and parsing it on the other side with the
logstash Kafka input (consumer).

Try something like:

LogFormat "{\"@timestamp\":\"%{%Y-%m-%dT%H:%M:%S%z}t\",\"mod_proxy\":{\"x-forwarded-for\":\"%{X-Forwarded-For}i\"},\"mod_headers\":{\"referer\":\"%{Referer}i\",\"user-agent\":\"%{User-Agent}i\",\"host\":\"%{Host}i\"},\"mod_log\":{\"server_name\":\"%V\",\"remote_logname\":\"%l\",\"remote_user\":\"%u\",\"first_request\":\"%r\",\"last_request_status\":\"%>s\",\"response_size_bytes\":%B,\"duration_usec\":%D},\"@version\":1 }" logstash_json

CustomLog "|rotatelogs /var/log/httpd/access_log_json-%s 3600" logstash_json
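On the consuming side, each line should then decode as a single JSON object. Here is a minimal sketch of that step, assuming the format emits one well-formed object per line with the mod_log block closed before "@version". The `parse_access_line` name and the sample values are made up for illustration. One caveat: Apache escapes non-printable characters in %r and %{...}i values as \xhh, which is not a valid JSON escape, so the sketch returns None for lines that fail to decode instead of raising.

```python
import json
from typing import Optional


def parse_access_line(line: str) -> Optional[dict]:
    """Decode one JSON access-log line; return None if it is not a JSON object."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        # e.g. a header value that Apache escaped as \xhh
        return None
    if not isinstance(event, dict):
        return None
    return event


# Made-up sample line in the shape of the logstash_json format above.
sample = (
    '{"@timestamp":"2014-08-07T21:40:01+0000",'
    '"mod_proxy":{"x-forwarded-for":"-"},'
    '"mod_headers":{"referer":"-","user-agent":"curl/7.30.0","host":"example.com"},'
    '"mod_log":{"server_name":"example.com","remote_logname":"-","remote_user":"-",'
    '"first_request":"GET / HTTP/1.1","last_request_status":"200",'
    '"response_size_bytes":512,"duration_usec":1834},"@version":1 }'
)

event = parse_access_line(sample)
if event is not None:
    print(event["mod_log"]["first_request"], event["mod_log"]["duration_usec"])
```

Because %B and %D are written without quotes, they arrive as numbers, while the status code stays a string since it is quoted in the format.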

From: Philip O'Toole <philip.otoole@yahoo.com.INVALID>
Sent: Thursday, August 07, 2014 3:01 PM
To: users@kafka.apache.org
Subject: Re: Apache webserver access logs + Kafka producer

Fluentd might work. Alternatively, configure rsyslog or syslog-ng on the box to watch the Apache
log files and send them to a suitable producer. For example, I wrote something that accepts
messages from a syslog client and streams them to Kafka: https://github.com/otoolep/syslog-gollector

More ideas here:


On Tuesday, August 5, 2014 2:48 PM, Florian Dambrine <florian@gumgum.com> wrote:

You might be interested in something like Logstash http://logstash.org for
log and event processing.



On 5 Aug 2014 at 23:17, "Jonathan Weeks" <jonathanbweeks@gmail.com> wrote:

> You can look at something like:
> https://github.com/harelba/tail2kafka
> (although I don’t know what the effort would be to update it, as it
> doesn’t look like it has been updated in a couple of years)
> We are using Flume to gather logs and then sending them to a Kafka
> cluster via a Flume Kafka sink, e.g.
> https://github.com/thilinamb/flume-ng-kafka-sink
> -Jonathan
> On Aug 5, 2014, at 1:40 PM, mvs.sree@gmail.com wrote:
> > Hi,
> >
> > I want to collect Apache web server logs in real time and send them to a
> > Kafka server. Is there an existing producer available to do this? If
> > not, can you please suggest a way to implement it?
> >
> > Regards,
> > Sree.