spark-dev mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: RFC: Remove "HBaseTest" from examples?
Date Tue, 19 Apr 2016 17:50:14 GMT
You're completely missing my point. I'm saying that HBase's current
support, even if there are bugs or things that still need to be done,
is much better than the Spark example, which is basically a call to
"SparkContext.hadoopRDD".

Spark's example is not helpful in learning how to build an HBase
application on Spark, and clashes head on with how the HBase
developers think it should be done. That, and because it brings too
many dependencies for something that is not really useful, is why I'm
suggesting removing it.


On Tue, Apr 19, 2016 at 10:47 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> There is an Open JIRA for fixing the documentation: HBASE-15473
>
> I would say the refguide link you provided should not be considered as
> complete.
>
> Note it is marked as Blocker by Sean B.
>
> On Tue, Apr 19, 2016 at 10:43 AM, Marcelo Vanzin <vanzin@cloudera.com>
> wrote:
>>
>> You're entitled to your own opinions.
>>
>> While you're at it, here's some much better documentation, from the
>> HBase project themselves, than what the Spark example provides:
>> http://hbase.apache.org/book.html#spark
>>
>> On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > bq. it's actually in use right now in spite of not being in any upstream
>> > HBase release
>> >
>> > If it is not in upstream, then it is not relevant for discussion on
>> > Apache mailing list.
>> >
>> > On Tue, Apr 19, 2016 at 10:38 AM, Marcelo Vanzin <vanzin@cloudera.com>
>> > wrote:
>> >>
>> >> Alright, if you prefer, I'll say "it's actually in use right now in
>> >> spite of not being in any upstream HBase release", and it's more
>> >> useful than a single example file in the Spark repo for those who
>> >> really want to integrate with HBase.
>> >>
>> >> Spark's example is really very trivial (just uses one of HBase's input
>> >> formats), which makes it not very useful as a blueprint for developing
>> >> HBase apps with Spark.
>> >>
>> >> On Tue, Apr 19, 2016 at 10:28 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >> > bq. I wouldn't call it "incomplete".
>> >> >
>> >> > I would call it incomplete.
>> >> >
>> >> > Please see HBASE-15333 'Enhance the filter to handle short, integer,
>> >> > long,
>> >> > float and double' which is a bug fix.
>> >> >
>> >> > Please exclude the presence of the related module in a vendor
>> >> > distro from this discussion.
>> >> >
>> >> > Thanks
>> >> >
>> >> > On Tue, Apr 19, 2016 at 10:23 AM, Marcelo Vanzin
>> >> > <vanzin@cloudera.com>
>> >> > wrote:
>> >> >>
>> >> >> On Tue, Apr 19, 2016 at 10:20 AM, Ted Yu <yuzhihong@gmail.com>
>> >> >> wrote:
>> >> >> > I want to note that the hbase-spark module in HBase is incomplete.
>> >> >> > Zhan has several patches pending review.
>> >> >>
>> >> >> I wouldn't call it "incomplete". Lots of functionality is there,
>> >> >> which
>> >> >> doesn't mean new ones, or more efficient implementations of existing
>> >> >> ones, can't be added.
>> >> >>
>> >> >> > hbase-spark module is currently only in master branch which
>> >> >> > would be released as 2.0
>> >> >>
>> >> >> Just as a side note, it's part of CDH 5.7.0, not that it matters
>> >> >> much
>> >> >> for upstream HBase.
>> >> >>
>> >> >> --
>> >> >> Marcelo
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Marcelo
>> >
>> >
>>
>>
>>
>> --
>> Marcelo
>
>



-- 
Marcelo


