chukwa-dev mailing list archives

From "Lewis John McGibbney (JIRA)" <>
Subject [jira] [Updated] (CHUKWA-734) Gora Storage System for Chukwa Logs
Date Fri, 20 Feb 2015 08:04:12 GMT


Lewis John McGibbney updated CHUKWA-734:
    Attachment: CHUKWA-734.patch

We just released Apache Gora 0.6 today, so I thought I would put this together with the aim
of building upon the initial patch.

The initial patch contains
 * an implementation of the GoraWriter, with as much documentation as I saw fit
 * a Gora implementation of the Chukwa Chunk, e.g. data and metadata
 * an HBase mapping (gora-hbase-mapping.xml)
 * addition of file
 * definition of the gora-hbase dependency, as well as the required gora-hadoop-X dependencies,
within pom.xml

What you need to do to get it working
 * uncomment the gora-hbase dependency within pom.xml
 * use GoraWriter as the writer implementation within agent-conf (please see the patch for the
addition to this file)
 * mvn install
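
For orientation, the uncommented gora-hbase dependency would look roughly like the sketch below.
The coordinates are the published Maven coordinates for Gora 0.6; the exact entry in the patched
pom.xml (for example, whether the version is given through a property) is an assumption here, so
please treat the patch itself as the reference.

```xml
<!-- Sketch only: the entry in the patched pom.xml may differ in detail. -->
<dependency>
  <groupId>org.apache.gora</groupId>
  <artifactId>gora-hbase</artifactId>
  <!-- Assumed to match the Gora 0.6 release mentioned above. -->
  <version>0.6</version>
</dependency>
```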

What I would like from you guys
 * try giving it a spin and see if you can use it... if you can't then I would very much appreciate
the feedback.

Some notes
 * HBase support in Gora 0.6 targets 0.98.8-hadoop2
 * Hadoop support covers 1.2.1 and 2.5.2
 * We use Avro for serialization, hence everything will be in HBase as Avro serialized data.

Some things [~eyang] and I still need to sort out
 * What does the primary key look like?

Next steps
 * I get feedback on this
 * I think about primary key support
 * I write some tests using Gora's MemStore to simulate mapping Chukwa chunk data to a Gora

> Gora Storage System for Chukwa Logs
> -----------------------------------
>                 Key: CHUKWA-734
>                 URL:
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>    Affects Versions: 0.6.0
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6.0
>         Attachments: CHUKWA-734.patch
>   Original Estimate: 5h
>  Remaining Estimate: 5h
> I would like to build a Gora-backed log-to-datastore module for Chukwa. I am going to
work on this today.
> Gora is an in-memory data modeling and storage abstraction.
> Gora powers the Apache Nutch 2.X software, which generates a bunch of log data. Having
a Chukwa monitoring tool for Nutch would be grand.

This message was sent by Atlassian JIRA
