spark-user mailing list archives

From Davies Liu <dav...@databricks.com>
Subject Re: Custom Partitioner
Date Tue, 01 Sep 2015 17:12:18 GMT
You can take the sortByKey as example:
https://github.com/apache/spark/blob/master/python/pyspark/rdd.py#L642
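Following that example, here is a minimal sketch of a range-partitioning function. The boundary values are hypothetical (sortByKey derives its boundaries by sampling the RDD), and the partitionBy call is shown for illustration only:

```python
from bisect import bisect_left

# Hypothetical split points -- sortByKey computes these by sampling the RDD.
boundaries = [10, 20, 30]

def range_partition(key):
    # Binary-search the boundaries to map a key to a partition id in
    # [0, len(boundaries)]: keys below the first boundary land in
    # partition 0, and so on.
    return bisect_left(boundaries, key)

# With an RDD of (key, value) pairs this would be used as (not run here):
# rdd.partitionBy(len(boundaries) + 1, range_partition)
```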

On Tue, Sep 1, 2015 at 3:48 AM, Jem Tucker <jem.tucker@gmail.com> wrote:
> something like...
>
> class RangePartitioner(Partitioner):
>     def __init__(self, numParts):
>         self.numPartitions = numParts
>         self.partitionFunc = self.rangePartition
>
>     def rangePartition(self, key):
>         # Logic to turn the key into a partition id
>         return partition_id
>
> On Tue, Sep 1, 2015 at 11:38 AM shahid ashraf <shahid@trialx.com> wrote:
>>
>> Hi
>>
>> I think a range partitioner is not available in PySpark, so if we want to
>> create one, how should we do it? That is my question.
>>
>> On Tue, Sep 1, 2015 at 3:57 PM, Jem Tucker <jem.tucker@gmail.com> wrote:
>>>
>>> Ah sorry, I misread your question. In PySpark it looks like you just
>>> need to instantiate the Partitioner class with numPartitions and
>>> partitionFunc.
>>>
>>> On Tue, Sep 1, 2015 at 11:13 AM shahid ashraf <shahid@trialx.com> wrote:
>>>>
>>>> Hi
>>>>
>>>> I did not get this, e.g. what if I need to create a custom partitioner
>>>> like a range partitioner?
>>>>
>>>> On Tue, Sep 1, 2015 at 3:22 PM, Jem Tucker <jem.tucker@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> You just need to extend Partitioner and override the numPartitions and
>>>>> getPartition methods, see below
>>>>>
>>>>> class MyPartitioner extends Partitioner {
>>>>>   override def numPartitions: Int = ??? // Return the number of partitions
>>>>>   override def getPartition(key: Any): Int = ??? // Return the partition
>>>>> id for a given key
>>>>> }
>>>>>
>>>>> On Tue, Sep 1, 2015 at 10:15 AM shahid qadri <shahidashraff@icloud.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Sparkians
>>>>>>
>>>>>> How can we create a custom partitioner in pyspark
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>>>>>> For additional commands, e-mail: user-help@spark.apache.org
>>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> with Regards
>>>> Shahid Ashraf
>>
>>
>>
>>
>> --
>> with Regards
>> Shahid Ashraf


