spark-dev mailing list archives

From Nathan Kronenfeld <nkronenf...@oculusinfo.com>
Subject Re: Problem with tests
Date Sat, 23 Nov 2013 13:00:27 GMT
https://github.com/apache/incubator-spark/pull/18


On Fri, Nov 22, 2013 at 6:35 PM, Reynold Xin <reynoldx@gmail.com> wrote:

> Can you provide a link to your pull request?
>
>
> On Sat, Nov 23, 2013 at 5:02 AM, Nathan Kronenfeld <nkronenfeld@oculusinfo.com> wrote:
>
> > Actually, looking into recent commits, it looks like my hunch may be
> > exactly correct:
> >
> > https://github.com/apache/incubator-spark/commit/f639b65eabcc8666b74af8f13a37c5fdf7e0185f
> > "PartitionPruningRDD is using index from parent"
> >
> > Is there anyone who can explain why this new behavior is preferable?  And,
> > if it's staying, can anyone suggest a way to fix my tests for this case?
> >
> > Thanks again,
> >                  Nathan
> >
> >
> > On Fri, Nov 22, 2013 at 3:56 PM, Nathan Kronenfeld <nkronenfeld@oculusinfo.com> wrote:
> >
> > > Hi there.
> > >
> > > I have a problem with the unit tests on a pull request I'm trying to tie
> > > up.  The changes deal with partition-related functions.
> > >
> > > In particular, the tests I have for an append-to-partition function
> > > work fine on my own machine, but fail on the build machine
> > > (https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/2152/console).
> > >
> > > The failure seems to stem from pulling a single partition out of the set.
> > > In either case, when I work on the full dataset:
> > >
> > > UnionRDD[11] at apply at FunSuite.scala:1265 (4 partitions)
> > >   UnionRDD[9] at apply at FunSuite.scala:1265 (3 partitions)
> > >     ParallelCollectionRDD[8] at apply at FunSuite.scala:1265 (1 partitions)
> > >     MapPartitionsWithContextRDD[7] at apply at FunSuite.scala:1265 (2 partitions)
> > >       ParallelCollectionRDD[4] at apply at FunSuite.scala:1265 (2 partitions)
> > >   ParallelCollectionRDD[10] at apply at FunSuite.scala:1265 (1 partitions)
> > >
> > >
> > > It seems to work.  When I pull one partition out of this, by wrapping a
> > > PartitionPruningRDD around it (pruning out everything but partition 2):
> > >
> > > PartitionPruningRDD[12] at apply at FunSuite.scala:1265 (1 partitions)
> > >   UnionRDD[11] at apply at FunSuite.scala:1265 (4 partitions)
> > >     UnionRDD[9] at apply at FunSuite.scala:1265 (3 partitions)
> > >       ParallelCollectionRDD[8] at apply at FunSuite.scala:1265 (1 partitions)
> > >       MapPartitionsWithContextRDD[7] at apply at FunSuite.scala:1265 (2 partitions)
> > >         ParallelCollectionRDD[4] at apply at FunSuite.scala:1265 (2 partitions)
> > >     ParallelCollectionRDD[10] at apply at FunSuite.scala:1265 (1 partitions)
> > >
> > >
> > > In this case, my local machine and the build machine seem to act
> > > differently.
> > >
> > > On my local machine, what is in the inner ParallelCollection partition #2
> > > shows up in the MapPartitionsWithContextRDD as partition #2 still.  On the
> > > build machine, this same partition shows up in the later RDD as partition
> > > #0 - presumably because everything else is pruned out, but that pruning
> > > should happen at an outer level, shouldn't it?
> > >
> > > Does anyone know why the build machine would act differently from my
> > > local machine here?
> > >
> > > Also, sadly, this worked fine two days ago.
> > >
> > > My only thought is that perhaps the PullRequestBuilder does a merge with
> > > current code, and someone broke this in the last day or two?  Past that,
> > > I'm at a bit of a loss.
> > >
> > > Thanks,
> > >                     -Nathan
> > >
> > >
> > > --
> > >
> > > Nathan Kronenfeld
> > > Senior Visualization Developer
> > > Oculus Info Inc
> > > 2 Berkeley Street, Suite 600,
> > > Toronto, Ontario M5A 4J5
> > > Phone:  +1-416-203-3003 x 238
> > > Email:  nkronenfeld@oculusinfo.com
> > >
> >
> >
> >
> >
>
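
The difference described in the quoted message (the surviving partition
reporting index #2 on one machine and #0 on the other) can be illustrated
with a small model. This is only a sketch in plain Python, not Spark code:
the `Partition` class, the `prune` function, and the `renumber` flag are
hypothetical names invented for illustration, and the model makes no claim
about which Spark revision follows which policy.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Partition:
    index: int         # index this partition reports to downstream code
    parent_index: int  # index of the parent partition it reads from


def prune(num_parent_partitions: int,
          keep: Callable[[int], bool],
          renumber: bool) -> List[Partition]:
    """Return the partitions that survive pruning under one of two index policies."""
    kept = [i for i in range(num_parent_partitions) if keep(i)]
    if renumber:
        # Survivors are numbered 0, 1, ... in order of appearance.
        return [Partition(index=new, parent_index=old)
                for new, old in enumerate(kept)]
    # Survivors keep the index they had in the parent.
    return [Partition(index=old, parent_index=old) for old in kept]


def keep_only_2(i: int) -> bool:
    return i == 2


# A parent with 4 partitions, pruned down to partition 2, as in the thread.
reused = prune(4, keep_only_2, renumber=False)
renumbered = prune(4, keep_only_2, renumber=True)
print(reused[0].index, renumbered[0].index)  # prints "2 0"
```

In this model, a test that asserts on the reported index passes under one
policy and fails under the other, while a test that asserts on
`parent_index` passes under both. Whether an equivalent value is available
in the real test is an open question here.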



-- 
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600,
Toronto, Ontario M5A 4J5
Phone:  +1-416-203-3003 x 238
Email:  nkronenfeld@oculusinfo.com
