spark-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Re: A Spark Design Problem
Date Sat, 01 Nov 2014 17:27:31 GMT
A join seems to me the proper approach, followed by keying the fits by KeyID
and using combineByKey to choose the best.
I am implementing that now and will report on performance.
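
For anyone following the thread, here is a rough sketch of that data flow in plain Java collections rather than Spark, so it runs without a cluster. The Key, Lock and Fit classes and the score function are hypothetical placeholders, since the real data model is not spelled out in this message. The per-bin pairing stands in for JavaPairRDD.join on the bin id, and the keep-the-better-one merge stands in for the combiner that would be handed to combineByKey.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BestFitSketch {
    // Hypothetical data model; the original problem statement is elided.
    static final class Key {
        final String keyId; final double value;
        Key(String keyId, double value) { this.keyId = keyId; this.value = value; }
    }
    static final class Lock {
        final String lockId; final double value;
        Lock(String lockId, double value) { this.lockId = lockId; this.value = value; }
    }
    static final class Fit {
        final String lockId; final double score;
        Fit(String lockId, double score) { this.lockId = lockId; this.score = score; }
    }

    // Hypothetical scoring rule: the closer the two values, the higher the score.
    static double score(Key k, Lock l) {
        return -Math.abs(k.value - l.value);
    }

    // Step 1 (join): pair every key with every lock in the same bin.
    // Step 2 (re-key by KeyID and combine): keep only the best fit per key.
    static Map<String, Fit> bestFitPerKey(Map<Integer, List<Key>> keysByBin,
                                          Map<Integer, List<Lock>> locksByBin) {
        Map<String, Fit> best = new HashMap<>();
        for (Map.Entry<Integer, List<Key>> bin : keysByBin.entrySet()) {
            List<Lock> locks = locksByBin.get(bin.getKey());
            if (locks == null) {
                continue; // inner-join semantics: a bin with no locks yields nothing
            }
            for (Key k : bin.getValue()) {
                for (Lock l : locks) {
                    Fit candidate = new Fit(l.lockId, score(k, l));
                    // Merge step: on a tie the fit seen first is kept.
                    best.merge(k.keyId, candidate,
                               (a, b) -> a.score >= b.score ? a : b);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<Integer, List<Key>> keys = new HashMap<>();
        keys.put(1, Arrays.asList(new Key("k1", 5.0)));
        Map<Integer, List<Lock>> locks = new HashMap<>();
        locks.put(1, Arrays.asList(new Lock("lockA", 9.0), new Lock("lockB", 5.5)));
        bestFitPerKey(keys, locks).forEach(
            (keyId, fit) -> System.out.println(keyId + " -> " + fit.lockId));
    }
}
```

The merge function is associative, which is what would let the same logic run as a per-partition combiner and then again across partitions. Whether this performs well on the real data is exactly what remains to be measured.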

On Fri, Oct 31, 2014 at 11:56 AM, Sonal Goyal <sonalgoyal4@gmail.com> wrote:

> Does the following help?
>
> JavaPairRDD<bin,key> join with JavaPairRDD<bin,lock>
>
> If you partition both RDDs by the bin id, I think you should be able to
> get what you want.
>
> Best Regards,
> Sonal
> Nube Technologies <http://www.nubetech.co>
>
> <http://in.linkedin.com/in/sonalgoyal>
>
>
>>
>> On Fri, Oct 31, 2014 at 5:44 PM, Steve Lewis <lordjoe2000@gmail.com>
>> wrote:
>>
>>>
>>>  The original problem is in biology but the following captures the CS
>>> issues, Assume ...
>>>
>>
