spark-dev mailing list archives

From "Yan"
Subject RE: Package Release Annoucement: Spark SQL on HBase "Astro"
Date Mon, 03 Aug 2015 17:45:49 GMT
HBase 1.0 should work fine, although we have not yet completed full testing. Support for 1.1
should be possible to add with minimal effort.



From: Ted Yu
Sent: Monday, August 03, 2015 10:33 AM
To: Bing Xiao (Bing)
Cc: Yan
Subject: Re: Package Release Annoucement: Spark SQL on HBase "Astro"

When I tried to compile against hbase 1.1.1, I got:

[ERROR] /home/hbase/ssoh/src/main/scala/org/apache/spark/sql/hbase/SparkSqlRegionObserver.scala:124:
overloaded method next needs result type
[ERROR]   override def next(result: java.util.List[Cell], limit: Int) = next(result)

Is there a plan to support HBase 1.x?
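
The error quoted above is the message scalac reports when a method with no declared result type calls another overload of the same name; the usual remedy is to declare the result type explicitly. The following is a minimal sketch of that pattern only. RowScanner and OneRowScanner are hypothetical stand-ins, not the HBase scanner interfaces, and an explicit result type may not be the only change needed to build against HBase 1.1.

```scala
// Hypothetical stand-in for a scanner interface with two overloads of next.
// This is NOT the HBase API; it only reproduces the overload pattern.
trait RowScanner {
  def next(result: java.util.List[String]): Boolean
  def next(result: java.util.List[String], limit: Int): Boolean
}

class OneRowScanner extends RowScanner {
  private var done = false

  // Emits a single row, then reports that nothing is left.
  override def next(result: java.util.List[String]): Boolean = {
    if (done) false
    else {
      result.add("row-1")
      done = true
      true
    }
  }

  // The explicit ": Boolean" matters here. Without it, scalac must infer this
  // method's type from a body that itself refers to the overloaded name next,
  // and it reports "overloaded method next needs result type".
  override def next(result: java.util.List[String], limit: Int): Boolean =
    next(result)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val rows = new java.util.ArrayList[String]()
    val scanner = new OneRowScanner
    println(scanner.next(rows, 10)) // delegates to the one-argument overload
    println(rows)
  }
}
```

In this sketch the limit argument is ignored, mirroring the delegation in the quoted line.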


On Wed, Jul 22, 2015 at 4:53 PM, Bing Xiao (Bing) wrote:
We are happy to announce the availability of release 1.0.0 of the Spark SQL on HBase package.
The main features of this package, dubbed “Astro”, include:

• Systematic and powerful handling of data pruning and intelligent scans, based on
a partial evaluation technique

• HBase pushdown capabilities, such as custom filters and coprocessors, to support
ultra-low-latency processing

• SQL and DataFrame support

• More SQL capabilities (secondary index, bloom filter, primary key,
bulk load, update)

• Joins with data from other sources

• Python/Java/Scala support

• Support for the latest Spark release, 1.4.0

The tests by the Huawei team and community contributors covered the following areas: bulk load;
projection pruning; partition pruning; partial evaluation; code generation; coprocessor; custom
filtering; DML; complex filtering on keys and non-keys; join/union with non-HBase data; DataFrame;
and multi-column-family tests.  We will post the test results, including performance tests, in the
middle of August.
You are very welcome to try out or deploy the package, and to help improve the integration tests
with various combinations of settings, extensive DataFrame tests, complex join/union
tests and extensive performance tests.  Please use the “Issues” and “Pull Requests” links
on the package homepage if you want to report bugs or request improvements or features.
Special thanks to project owner and technical leader Yan Zhou, the Huawei global team, community
contributors and Databricks.   Databricks has provided great assistance from the design
through to the release.
“Astro”, the Spark SQL on HBase package, will be useful for ultra-low-latency query and
analytics of large-scale data sets in vertical enterprises. We will continue to work with
the community to develop new features and improve the code base.  Your comments and suggestions
are greatly appreciated.

Yan Zhou / Bing Xiao
Huawei Big Data team
