chukwa-dev mailing list archives

From "Eric Yang (JIRA)" <>
Subject [jira] [Commented] (CHUKWA-734) Gora Storage System for Chuckwa Logs
Date Sat, 21 Feb 2015 22:44:11 GMT


Eric Yang commented on CHUKWA-734:

I got an error when running the TestHBaseWriter unit test:

Test set: org.apache.hadoop.chukwa.datacollection.writer.TestHBaseWriter
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.707 sec <<< FAILURE!
testWriters(org.apache.hadoop.chukwa.datacollection.writer.TestHBaseWriter)  Time elapsed: 0.582 sec  <<< ERROR!
java.lang.IncompatibleClassChangeError: Implementing class
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(
        at Method)
        at java.lang.ClassLoader.loadClass(
        at sun.misc.Launcher$AppClassLoader.loadClass(
        at java.lang.ClassLoader.loadClass(
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(
        at org.apache.hadoop.hbase.mapreduce.MapreduceTestingShim.<clinit>(
        at org.apache.hadoop.hbase.HBaseTestingUtility.createDirsAndSetProperties(
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniDFSCluster(
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(
        at org.apache.hadoop.chukwa.datacollection.writer.TestHBaseWriter.setUp(
        at junit.framework.TestCase.runBare(
        at junit.framework.TestResult$1.protect(
        at junit.framework.TestResult.runProtected(
        at junit.framework.TestSuite.runTest(
        at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(
        at java.lang.reflect.Method.invoke(
        at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(
        at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(
        at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(
        at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(
        at org.apache.maven.surefire.booter.ForkedBooter.main(

This exception happens when using Hadoop 1.2.1 + HBase 0.98.8 + Hadoop-compat1.  Does Gora
support Hadoop 1?
We probably need to set up another build profile to switch between Hadoop 1 and Hadoop 2.

For table schema design and row key design, maybe we can use something like this:

Row Key: [Invert Date]:[Data Type]:[Primary Key]
Column Family: log
Column Name: [Sequence ID]
Timestamp: [log entry timestamp]


Row Key:
Column Family: log
Column Name: 1230
Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
Timestamp: 1358942490
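
To make the row key layout above concrete, here is a minimal sketch of how such a key could be assembled. It rests on assumptions that are not in the proposal: "inverted date" is read as Long.MAX_VALUE minus the hour-truncated epoch seconds (so newer hours sort first), the parts are joined as plain strings, and the data type "SysLog" and primary key "host1.example.com" are hypothetical placeholders. The class and method names are illustrative only and are not Chukwa or Gora APIs.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

final class RowKeySketch {
    // Assumption: "inverted date" means Long.MAX_VALUE minus the start of the hour
    // (epoch seconds), zero-padded to 19 digits so string order matches numeric order.
    static String invertedHour(long epochSeconds) {
        long hourStart = Instant.ofEpochSecond(epochSeconds)
                .truncatedTo(ChronoUnit.HOURS)
                .getEpochSecond();
        return String.format(Locale.ROOT, "%019d", Long.MAX_VALUE - hourStart);
    }

    // Joins the three parts as [Invert Date]:[Data Type]:[Primary Key].
    static String rowKey(long epochSeconds, String dataType, String primaryKey) {
        return invertedHour(epochSeconds) + ":" + dataType + ":" + primaryKey;
    }

    public static void main(String[] args) {
        // 1358942490 is the timestamp from the example entry; the data type and
        // primary key values are hypothetical.
        System.out.println(rowKey(1358942490L, "SysLog", "host1.example.com"));
    }
}
```

Under this reading, all entries from the same hour, data type, and primary key share one row, and rows for later hours sort ahead of rows for earlier hours.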

The inverted date allows the table to be partitioned by hour, by day of the month, or by month.
Using the column name for consecutive sequence IDs allows fast retrieval in a linear scan.
This format is typically good for quickly retrieving an hour's worth of logs for a node.  Hence,
if we batch-scan the table in a rolling window via a MapReduce job at every hour interval,
we get an even spread of the workload across multiple MapReduce tasks.
Can Gora map a sequence ID value to a column name in HBase?
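
One detail matters if sequence IDs become column names: HBase orders column qualifiers within a row by unsigned lexicographic byte comparison, so a linear scan returns entries in sequence order only if the encoding preserves numeric order. The sketch below assumes non-negative sequence IDs encoded as fixed-width big-endian longs and emulates the ordering with a plain comparator; it uses no HBase or Gora classes and does not answer whether Gora can produce such a mapping. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Comparator;
import java.util.TreeMap;

final class SequenceColumnSketch {
    // Assumption: sequence IDs are non-negative and stored as 8-byte big-endian values.
    static byte[] qualifier(long sequenceId) {
        return ByteBuffer.allocate(Long.BYTES).putLong(sequenceId).array();
    }

    // Unsigned lexicographic byte order, emulating how qualifiers sort within a row.
    static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    public static void main(String[] args) {
        // A sorted map stands in for the columns of one row in the "log" family.
        TreeMap<byte[], String> row = new TreeMap<>(UNSIGNED_LEX);
        row.put(qualifier(1231), "a later entry");
        row.put(qualifier(1230), "2013-01-23 12:01:30 INFO This is a log entry.");
        row.forEach((q, v) -> System.out.println(ByteBuffer.wrap(q).getLong() + " -> " + v));
    }
}
```

By contrast, decimal strings of varying width would not sort numerically ("10" sorts before "9"), which is why the sketch uses a fixed-width encoding.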

> Gora Storage System for Chuckwa Logs
> ------------------------------------
>                 Key: CHUKWA-734
>                 URL:
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>    Affects Versions: 0.6.0
>            Reporter: Lewis John McGibbney
>             Fix For: 0.6.0
>         Attachments: CHUKWA-734.patch
>   Original Estimate: 5h
>  Remaining Estimate: 5h
> I would like to build a Gora-backed log-to-datastore module for Chuckwa. I am going to
> work on this today.
> Gora is an in-memory data modeling and storage abstraction 
> Gora powers the Apache Nutch 2.X software which generates a bunch of log data. Having
> a Chuckwa monitoring tool for Nutch would be grand.

This message was sent by Atlassian JIRA
