spark-user mailing list archives

From Sujit Pal <sujitatgt...@gmail.com>
Subject Re: pyspark mappartions ()
Date Sat, 14 May 2016 16:23:58 GMT
I built this recently using the accepted answer on this SO page:

http://stackoverflow.com/questions/26741714/how-does-the-pyspark-mappartitions-function-work/26745371

-sujit

On Sat, May 14, 2016 at 7:00 AM, Mathieu Longtin <mathieu@closetwork.org>
wrote:

> From memory:
> def processor(iterator):
>   for item in iterator:
>     newitem = do_whatever(item)
>     yield newitem
>
> newdata = data.mapPartitions(processor)
>
> Basically, your function takes an iterator over one partition's items as
> its argument, and must either be a generator or return an iterable.
>
> On Sat, May 14, 2016 at 12:39 AM Abi <analyst.tech.jobs@gmail.com> wrote:
>
>>
>>
>> On Tue, May 10, 2016 at 2:20 PM, Abi <analyst.tech.jobs@gmail.com> wrote:
>>
>>> Is there any example of this? I want to see how you write the
>>> iterable example
>>
>>
>> --
> Mathieu Longtin
> 1-514-803-8977
>
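
For anyone finding this thread later, here is a minimal sketch of the pattern
quoted above, covering both the generator form and the "return an iterable"
form. The names do_whatever, partition_sum and run_on_spark are hypothetical
and only for illustration; run_on_spark assumes you already have a
SparkContext called sc. The local check at the bottom treats each plain list
as one partition, so it runs without Spark.

```python
def do_whatever(item):
    # Hypothetical per-item transformation; stands in for real work.
    return item * 2


def processor(iterator):
    # Generator form: called once per partition, yields one output per input.
    for item in iterator:
        yield do_whatever(item)


def partition_sum(iterator):
    # Non-generator form: consumes the partition and returns an iterable
    # (here a one-element list) instead of yielding.
    return [sum(iterator)]


def run_on_spark(sc):
    # Assumption: `sc` is an existing pyspark SparkContext.
    data = sc.parallelize(range(10), 2)  # 10 items spread over 2 partitions
    return data.mapPartitions(processor).collect()


# Local check without Spark: each inner list plays the role of one partition.
partitions = [[0, 1, 2], [3, 4]]
local_result = [out for part in partitions for out in processor(iter(part))]
local_sums = [s for part in partitions for s in partition_sum(iter(part))]
```

The generator form is usually the better default, since it streams items
instead of holding a whole partition's output in memory. The list-returning
form fits when the function reduces a partition to a few values.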
