spark-user mailing list archives

From "Cheng, Hao" <hao.ch...@intel.com>
Subject RE: Spark SQL Custom Predicate Pushdown
Date Fri, 16 Jan 2015 07:07:06 GMT
The Data Source API should work for this purpose.
It supports column pruning and predicate pushdown:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala

Examples can also be found in the unit tests:
https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/sources
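
For instance, here is a rough sketch of a relation that receives the pruned columns and the pushed-down predicates, written against the current master API. The schema and everything Accumulo-specific (the DocumentRelation name, the translation into scan ranges) are made up for illustration:

// Sketch only: a BaseRelation with PrunedFilteredScan receives the columns
// the query needs plus the predicates Spark SQL could translate into
// sources.Filter objects.
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SQLContext}
import org.apache.spark.sql.sources._
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

class DocumentRelation(val sqlContext: SQLContext)
  extends BaseRelation with PrunedFilteredScan {

  // The schema of the documents exposed to Spark SQL (hypothetical columns).
  override def schema: StructType = StructType(Seq(
    StructField("docId", StringType, nullable = false),
    StructField("score", IntegerType, nullable = true)))

  // Spark SQL calls this with only the required columns and the filters
  // it managed to push down.
  override def buildScan(requiredColumns: Array[String],
                         filters: Array[Filter]): RDD[Row] = {
    // Translate Spark's filters into your own "query object"; filters you
    // cannot handle can simply be skipped.
    val predicates = filters.collect {
      case EqualTo("docId", v)     => s"docId = $v"
      case GreaterThan("score", v) => s"score > $v"
    }
    // Hypothetical: build the Accumulo scan from the translated predicates
    // and return the matching rows as an RDD[Row].
    ???
  }
}

// A provider so the relation can be referenced from
// CREATE TEMPORARY TABLE ... USING ... statements.
class DefaultSource extends RelationProvider {
  override def createRelation(sqlContext: SQLContext,
                              parameters: Map[String, String]): BaseRelation =
    new DocumentRelation(sqlContext)
}

Note that the pushed-down filters are currently an optimization only: Spark SQL re-evaluates them after the scan, so the translation into your query object can be best-effort.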


From: Corey Nolet [mailto:cjnolet@gmail.com]
Sent: Friday, January 16, 2015 1:51 PM
To: user
Subject: Spark SQL Custom Predicate Pushdown

I have document storage services in Accumulo that I'd like to expose to Spark SQL. I am able
to push down predicate logic to Accumulo to have it perform only the seeks necessary on each
tablet server to grab the results being asked for.

I'm interested in using Spark SQL to push those predicates down to the tablet servers. Where
would I begin my implementation? Currently I have an input format which accepts a "query object"
that gets pushed down. How would I extract this information from the HiveContext/SQLContext
to be able to push this down?