spark-user mailing list archives

From Yu Wei <yu20...@hotmail.com>
Subject Re: Is it good choice to use DAO to store results generated by spark application?
Date Wed, 20 Jul 2016 14:22:56 GMT
I'm a beginner in big data. I don't have much knowledge about HBase or Hive.

What's the difference between HBase and Hive/HDFS for storing data for analytics?


Thanks,

Jared

________________________________
From: ayan guha <guha.ayan@gmail.com>
Sent: Wednesday, July 20, 2016 9:34:24 PM
To: Rabin Banerjee
Cc: user; Yu Wei; Deepak Sharma
Subject: Re: Is it good choice to use DAO to store results generated by spark application?


Just as a sanity check: saving data to HBase for analytics may not be the best choice. Is there
any specific reason for not using HDFS or Hive?

On 20 Jul 2016 20:57, "Rabin Banerjee" <dev.rabin.banerjee@gmail.com> wrote:
Hi Wei,

You can do something like this:


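import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory
import scala.collection.JavaConverters._

// Assumes rdd is an RDD[Put] and tablename is defined elsewhere.
rdd.foreachPartition { part =>
    val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = conn.getTable(TableName.valueOf(tablename))
    // part.foreach(inp => { println(inp); table.put(inp) })  // line-by-line put
    table.put(part.toList.asJava)  // one batched put for the whole partition
    table.close()
    conn.close()
}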

Now, if you want to wrap it inside a DAO, it's up to you. A DAO will abstract things, but
it will ultimately use the same code.

Note: Always use the HBase ConnectionFactory to get a connection, and write data on a per-partition basis.
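
A minimal sketch of what wrapping the per-partition write in a DAO could look like. All names here (ResultDao, InMemoryDao, DaoSketch, writePartitions) are hypothetical, and the in-memory implementation is only a stand-in so the sketch runs without HBase or Spark; an HBase-backed version would put the connection, put and close calls inside saveAll.

```scala
// Hypothetical DAO interface: one batched write per partition.
trait ResultDao[T] {
  def saveAll(batch: Iterator[T]): Unit
}

// In-memory stand-in (an assumption for illustration, not an HBase client).
// An HBase-backed version would open the connection and table inside
// saveAll, call table.put with the whole batch, then close both.
class InMemoryDao[T] extends ResultDao[T] {
  private val buffer = scala.collection.mutable.ArrayBuffer.empty[T]
  var batches = 0 // number of batched writes received

  def saveAll(batch: Iterator[T]): Unit = {
    buffer ++= batch
    batches += 1
  }

  def stored: List[T] = buffer.toList
}

object DaoSketch {
  // Stand-in for rdd.foreachPartition: each inner Seq plays the role
  // of one partition, and the DAO is called once per partition.
  def writePartitions[T](partitions: Seq[Seq[T]], dao: ResultDao[T]): Unit =
    partitions.foreach(part => dao.saveAll(part.iterator))
}
```

In a real Spark job the DAO would normally be created inside the foreachPartition closure, because the closure is serialized and shipped to the executors and an HBase connection cannot be serialized.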

Regards,
Rabin Banerjee


On Wed, Jul 20, 2016 at 12:06 PM, Yu Wei <yu2003w@hotmail.com> wrote:

I need to write all data received from MQTT into HBase for further processing.

It is not the final result. I also need to read the data back from HBase for analysis.


Is it a good choice to use a DAO in such a situation?


Thx,

Jared


________________________________
From: Deepak Sharma <deepakmca05@gmail.com>
Sent: Wednesday, July 20, 2016 12:34:07 PM
To: Yu Wei
Cc: spark users
Subject: Re: Is it good choice to use DAO to store results generated by spark application?


I am using a DAO in a Spark application to write the final computation to Cassandra, and it performs
well.
What kinds of issues do you foresee in using a DAO for HBase?

Thanks
Deepak

On 19 Jul 2016 10:04 pm, "Yu Wei" <yu2003w@hotmail.com> wrote:

Hi guys,


I am writing a Spark application and want to store the results it generates in HBase.

Do I need to access HBase via the Java API directly?

Or is it a better choice to use a DAO, as with a traditional RDBMS? I suspect that using a DAO
would cause a major performance degradation and have other negative impacts. However, I have
little knowledge in this field.


Any advice?


Thanks,

Jared



