spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: sparkcontext.objectFile return thousands of partitions
Date Thu, 22 Jan 2015 19:01:29 GMT
Yes, that second argument is what I was referring to, but yes it's a
*minimum*, oops, right. OK, you will want to coalesce then, indeed.
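As a sketch of the coalesce step being suggested: the second argument to `sc.objectFile` is only a *minimum* partition count, so to get exactly 8 partitions you coalesce afterwards. The `parallelize` call below is a local-mode stand-in for reading the saved object file, and the app name and counts are illustrative, not from this thread:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CoalesceExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("coalesce-example").setMaster("local[2]"))

    // Stand-in for sc.objectFile[LabeledPoint]("file:///tmp/mydir", 8),
    // which may come back with far more than 8 partitions.
    val many = sc.parallelize(1 to 1000, 100)

    // coalesce merges partitions without a shuffle, down to at most 8.
    val eight = many.coalesce(8)
    println(eight.partitions.length)

    sc.stop()
  }
}
```

`coalesce` avoids a shuffle by merging existing partitions; if you instead wanted *more* partitions, or an even redistribution of data, `repartition` (which does shuffle) would be the call to reach for.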

On Thu, Jan 22, 2015 at 6:51 PM, Wang, Ningjun (LNG-NPV)
<ningjun.wang@lexisnexis.com> wrote:
> > If you know that this number is too high you can request a number of
> > partitions when you read it.
>
>
>
> How to do that? Can you give a code snippet? I want to read it into 8
> partitions, so I do
>
>
>
> val rdd2 = sc.objectFile[LabeledPoint]("file:///tmp/mydir", 8)
>
> However, rdd2 contains thousands of partitions instead of 8.
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

