spark-user mailing list archives

From Benjamin Kim <benb...@hotmail.com>
Subject RE: Writing to HBase
Date Thu, 12 Dec 2013 18:59:15 GMT
Hi Philip,
I got this bit of code to work in the spark-shell using Scala against our dev HBase cluster:
-bash-4.1$ export SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/cloudera/parcels/CDH/lib/hbase/hbase.jar:/opt/cloudera/parcels/CDH/lib/hbase/conf:/opt/cloudera/parcels/CDH/lib/hadoop/conf
-bash-4.1$ ./spark-shell

scala> import org.apache.hadoop.hbase.HBaseConfiguration
scala> import org.apache.hadoop.hbase.client._
scala> import org.apache.hadoop.hbase.util.Bytes
scala> val conf = HBaseConfiguration.create()
scala> val table = new HTable(conf, "my_items")
scala> val p = new Put(Bytes.toBytes("strawberry-fruit"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("item"), Bytes.toBytes("strawberry"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("category"), Bytes.toBytes("fruit"))
scala> p.add(Bytes.toBytes("item"), Bytes.toBytes("price"), Bytes.toBytes("0.35"))
scala> table.put(p)
It put the new row "strawberry-fruit" into an HBase table.
Sorry, but I have another newbie question: how do I add those CLASSPATH dependencies when
I compile a streaming jar in sbt, so that the HBase configs are picked up automatically?
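For what it's worth, one common sbt arrangement looks like the minimal build.sbt sketch below. The version numbers and the CDH parcel path are assumptions for a Spark 0.8-era / CDH setup, not something confirmed in this thread, so adjust them to your cluster:

```scala
// build.sbt -- minimal sketch; versions and paths are assumptions
name := "streaming-hbase"

// Spark 0.8.x builds against Scala 2.9.3
scalaVersion := "2.9.3"

libraryDependencies ++= Seq(
  // "provided" keeps Spark itself out of the assembled jar,
  // since the cluster already supplies it at runtime
  "org.apache.spark" %% "spark-streaming" % "0.8.1-incubating" % "provided"
)

// Compile against the CDH HBase jar straight from the local parcel
unmanagedJars in Compile += file("/opt/cloudera/parcels/CDH/lib/hbase/hbase.jar")
```

Note that the conf directories (hbase/conf, hadoop/conf) are not jars: they carry hbase-site.xml and friends, and still need to be on the runtime classpath — for example left in SPARK_CLASSPATH as above, or copied into src/main/resources so they end up inside the jar.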
Thanks,
Ben
Date: Thu, 5 Dec 2013 10:24:02 -0700
From: philip.ogren@oracle.com
To: user@spark.incubator.apache.org
Subject: Re: Writing to HBase

Here's a good place to start:

http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCACyZca3ASKwD-tuJHQi1805BN7ScTguAoRuHd5xTxCSUL1aNvQ@mail.gmail.com%3E

On 12/5/2013 10:18 AM, Benjamin Kim wrote:
Does anyone have an example or some sort of starting-point code for writing from Spark Streaming into HBase?

We currently stream ad-server event log data using Flume-NG to tail log entries, collect them, and put them directly into an HBase table. We would like to do the same with Spark Streaming, but with the data massaging and simple data analysis done beforehand. This would cut down the steps in prepping data and the number of tables for our data scientists and real-time feedback systems.

Thanks,
Ben
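For the original question, a rough Scala sketch of the shape this usually takes, using the same HBase 0.94-era client API as the spark-shell session above. The socket source, batch interval, table name, and comma-separated input format are all placeholders, not a tested implementation:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamToHBase {
  def main(args: Array[String]) {
    // Batch interval and socket source are placeholders.
    val ssc = new StreamingContext("local[2]", "StreamToHBase", Seconds(5))
    val lines = ssc.socketTextStream("localhost", 9999)

    // foreachRDD is the DStream output operation (older releases call it foreach).
    lines.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // HTable is not serializable: open it on the worker, once per partition,
        // rather than on the driver.
        val conf = HBaseConfiguration.create()
        val table = new HTable(conf, "my_items")
        records.foreach { line =>
          // Assumed input format: item,category,price
          val Array(item, category, price) = line.split(",")
          val p = new Put(Bytes.toBytes(item + "-" + category))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("item"), Bytes.toBytes(item))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("category"), Bytes.toBytes(category))
          p.add(Bytes.toBytes("item"), Bytes.toBytes("price"), Bytes.toBytes(price))
          table.put(p)
        }
        table.close()
      }
    }

    ssc.start()
  }
}
```

Any per-record massaging or simple analysis would go inside the foreachRDD (or as map/filter transformations on the DStream before it).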