spark-user mailing list archives

From 诺铁 <noty...@gmail.com>
Subject Re: Does filter on an RDD scan every data item ?
Date Mon, 08 Dec 2014 04:23:03 GMT
There is a

*PartitionPruningRDD*

:: DeveloperApi :: A RDD used to prune RDD partitions/partitions so we can
avoid launching tasks on all partitions. An example use case: If we know
the RDD is partitioned by range, and the execution DAG has a filter on the
key, we can avoid launching tasks on partitions that don't have the range
covering the key.

This seems to be made exactly for this case, but it is marked as DeveloperApi.
Does anyone know how to use it?
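
For reference, here is a rough, untested sketch of how it might be used, going by the scaladoc above. The names PruneExample and lookup and the sample data are made up for illustration. It assumes the RDD is range-partitioned with a RangePartitioner, so that the partition holding a key can be found with getPartition, and it uses PartitionPruningRDD.create with a predicate on the parent partition index. Since this is a DeveloperApi, the details may differ between Spark versions.

```scala
import org.apache.spark.{RangePartitioner, SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; only needed on older Spark releases
import org.apache.spark.rdd.PartitionPruningRDD

// Hypothetical example object, not part of Spark.
object PruneExample {

  // Look up a single key while launching a task only on the partition that can contain it.
  def lookup(sc: SparkContext, key: Int): Array[(Int, String)] = {
    // Made-up sample data: keys 1..1000 with a string payload.
    val pairs = sc.parallelize(1 to 1000).map(i => (i, s"value-$i"))

    // Range-partition by key so that each partition covers a contiguous key range.
    val partitioner = new RangePartitioner(8, pairs)
    val ranged = pairs.partitionBy(partitioner)

    // Ask the partitioner which partition index the key maps to.
    val target = partitioner.getPartition(key)

    // Keep only that partition; the predicate receives the parent's partition index.
    val pruned = PartitionPruningRDD.create(ranged, (idx: Int) => idx == target)

    // The surviving partition still holds other keys, so the filter is still required.
    pruned.filter { case (k, _) => k == key }.collect()
  }

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("prune-example").setMaster("local[2]"))
    try lookup(sc, 42).foreach(println)
    finally sc.stop()
  }
}
```

The filter itself still scans every item in the partition that is kept. The pruning only avoids launching tasks on the other partitions.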



On Mon, Dec 8, 2014 at 11:31 AM, nsareen <nsareen@gmail.com> wrote:

> @Sowen, I would appreciate it if you could explain how Spark SQL would help
> in my scenario.
