kudu-user mailing list archives

From Benjamin Kim <bbuil...@gmail.com>
Subject Re: Spark on Kudu
Date Wed, 24 Feb 2016 23:41:43 GMT

It looks like KUDU-1321 fulfills most of the basic requirements (Kudu RDD, Kudu DStream) in KUDU-1214.
Am I right? Besides shoring up more Spark SQL functionality (DataFrames) and writing the documentation,
what more needs to be done? Optimizations?

I believe it’s a good starting point for using Spark with Kudu and for comparing it to HBase
with Spark (which is not clean).


> On Feb 24, 2016, at 3:10 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> AFAIK no one is working on it, but we did manage to get this in for 0.7.0: https://issues.cloudera.org/browse/KUDU-1321
> It's a really simple wrapper, and yes you can use SparkSQL on Kudu, but it will require
> a lot more work to make it fast/useful.
> Hope this helps,
> J-D
> On Wed, Feb 24, 2016 at 3:08 PM, Benjamin Kim <bbuild11@gmail.com> wrote:
> I see this KUDU-1214 <https://issues.cloudera.org/browse/KUDU-1214> targeted for
> 0.8.0, but I see no progress on it. When this is complete, will this mean that Spark will
> be able to work with Kudu both programmatically and as a client via Spark SQL? Or is there
> more work that needs to be done on the Spark side for it to work?
> Just curious.
> Cheers,
> Ben
