spark-user mailing list archives

From: Cheng Lian
Subject: Re: Join : Giving incorrect result
Date: Wed, 04 Jun 2014 14:12:36 GMT
Hi Ajay, would you mind putting together a minimal code snippet that
reproduces this issue and pasting it here?
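
One check such a snippet could include is sketched below. It is plain Python, not Spark code, and it assumes records are (key, value) pairs with integer keys. The names `expected_join_count` and `join_in_partitions` and the sample data are hypothetical and do not come from this thread. The idea is that the record count of an inner join depends only on how often each key occurs on each side, so it should not change with the number of partitions or cores.

```python
from collections import Counter, defaultdict


def expected_join_count(left, right):
    """Record count of an inner join on key, from key multiplicities alone."""
    left_keys = Counter(k for k, _ in left)
    right_keys = Counter(k for k, _ in right)
    # Each key contributes (occurrences on the left) * (occurrences on the right).
    return sum(n * right_keys[k] for k, n in left_keys.items())


def join_in_partitions(left, right, num_partitions):
    """Hash-partition both sides by key, join each partition, concatenate."""
    joined = []
    for p in range(num_partitions):
        # Index the right-hand records that fall into partition p.
        right_by_key = defaultdict(list)
        for k, v in right:
            if hash(k) % num_partitions == p:
                right_by_key[k].append(v)
        # Pair every left-hand record in partition p with its matches.
        for k, v in left:
            if hash(k) % num_partitions == p:
                joined.extend((k, (v, w)) for w in right_by_key.get(k, []))
    return joined


# Hypothetical sample data: (key, value) pairs.
left = [(1, "a"), (1, "b"), (2, "c"), (3, "d")]
right = [(1, "x"), (2, "y"), (2, "z"), (4, "w")]

for parts in (1, 2, 3):
    # The count must match the expected value for every partitioning.
    assert len(join_in_partitions(left, right, parts)) == expected_join_count(left, right)
```

If the count reported by the real job differs from a value computed this way on the same input, that points at the job or the runtime rather than at the data.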

On Wed, Jun 4, 2014 at 8:32 PM, Ajay Srivastava wrote:

> Hi,
> I am joining two RDDs, and the join gives a different result (the number
> of records) each time I run the code on the same input.
> The input files are large enough to be divided into two splits. When the
> program runs on two workers with a single core assigned to each, the output
> is consistent and looks correct. But when a single worker is used with two
> or more cores, the result seems to be random: the count of joined records
> is different every time.
> Does this sound like a defect, or do I need to take care of something when
> using join? I am using spark-0.9.1.
> Regards
> Ajay
