spark-issues mailing list archives

From "Shixiong Zhu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-2075) Anonymous classes are missing from Spark distribution
Date Thu, 18 Dec 2014 11:01:16 GMT

    [ https://issues.apache.org/jira/browse/SPARK-2075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251507#comment-14251507 ]

Shixiong Zhu commented on SPARK-2075:
-------------------------------------

I dug deeper and found something odd:

When I compile with `mvn -Dhadoop.version=1.2.1 -DskipTests clean package -pl core -am`,
the bytecode of `saveAsTextFile` is:
{noformat}
public void saveAsTextFile(java.lang.String);
  Code:
   0:   aload_0
   1:   new     #1577; //class org/apache/spark/rdd/RDD$$anonfun$27
   4:   dup
   5:   aload_0
   6:   invokespecial   #1578; //Method org/apache/spark/rdd/RDD$$anonfun$27."<init>":(Lorg/apache/spark/rdd/RDD;)V
   9:   getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   12:  ldc_w   #441; //class scala/Tuple2
   15:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   18:  invokevirtual   #447; //Method map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
   21:  astore_2
   22:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   25:  ldc_w   #1580; //class org/apache/hadoop/io/NullWritable
   28:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   31:  astore_3
   32:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   35:  ldc_w   #1582; //class org/apache/hadoop/io/Text
   38:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   41:  astore  4
   43:  getstatic       #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   46:  aload_2
   47:  invokevirtual   #23; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions$default$4:(Lorg/apache/spark/rdd/RDD;)Lscala/runtime/Null$;
   50:  astore  5
   52:  getstatic       #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   55:  aload_2
   56:  aload_3
   57:  aload   4
   59:  aload   5
   61:  pop
   62:  aconst_null
   63:  invokevirtual   #47; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
   66:  aload_1
   67:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   70:  ldc_w   #1584; //class org/apache/hadoop/mapred/TextOutputFormat
   73:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   76:  invokevirtual   #1588; //Method org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
   79:  return
{noformat}

When I compile with `mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package -pl core -am`,
the bytecode of `saveAsTextFile` is different:
{noformat}
public void saveAsTextFile(java.lang.String);
  Code:
   0:   getstatic       #21; //Field org/apache/spark/rdd/RDD$.MODULE$:Lorg/apache/spark/rdd/RDD$;
   3:   aload_0
   4:   new     #1577; //class org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1
   7:   dup
   8:   aload_0
   9:   invokespecial   #1578; //Method org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1."<init>":(Lorg/apache/spark/rdd/RDD;)V
   12:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   15:  ldc_w   #441; //class scala/Tuple2
   18:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   21:  invokevirtual   #447; //Method map:(Lscala/Function1;Lscala/reflect/ClassTag;)Lorg/apache/spark/rdd/RDD;
   24:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   27:  ldc_w   #1580; //class org/apache/hadoop/io/NullWritable
   30:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   33:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   36:  ldc_w   #1582; //class org/apache/hadoop/io/Text
   39:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   42:  getstatic       #1587; //Field scala/math/Ordering$.MODULE$:Lscala/math/Ordering$;
   45:  getstatic       #471; //Field scala/Predef$.MODULE$:Lscala/Predef$;
   48:  invokevirtual   #1591; //Method scala/Predef$.conforms:()Lscala/Predef$$less$colon$less;
   51:  invokevirtual   #1595; //Method scala/math/Ordering$.ordered:(Lscala/Function1;)Lscala/math/Ordering;
   54:  invokevirtual   #47; //Method org/apache/spark/rdd/RDD$.rddToPairRDDFunctions:(Lorg/apache/spark/rdd/RDD;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/math/Ordering;)Lorg/apache/spark/rdd/PairRDDFunctions;
   57:  aload_1
   58:  getstatic       #439; //Field scala/reflect/ClassTag$.MODULE$:Lscala/reflect/ClassTag$;
   61:  ldc_w   #1597; //class org/apache/hadoop/mapred/TextOutputFormat
   64:  invokevirtual   #445; //Method scala/reflect/ClassTag$.apply:(Ljava/lang/Class;)Lscala/reflect/ClassTag;
   67:  invokevirtual   #1601; //Method org/apache/spark/rdd/PairRDDFunctions.saveAsHadoopFile:(Ljava/lang/String;Lscala/reflect/ClassTag;)V
   70:  return
{noformat}

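Note: when built against Hadoop 1.2.1, `saveAsTextFile` passes the default `Ordering` value,
`null`, to `rddToPairRDDFunctions`, whereas when built against Hadoop 2.2.0 it calls
`Ordering.ordered` to create a new `Ordering`. The anonymous function class is also named
differently: `RDD$$anonfun$27` in the first build and `RDD$$anonfun$saveAsTextFile$1` in the
second.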
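
The call to `rddToPairRDDFunctions$default$4` in the first listing suggests that the fourth parameter (the `Ordering`) has a default value, and the second listing shows an `Ordering` being resolved at the call site instead. A minimal sketch of how Scala can compile the same call either way, assuming an implicit parameter with a `null` default; `orderingFor`, `PlainKey` and `ComparableKey` are hypothetical names used only for illustration, not Spark or Hadoop code:

```scala
object ImplicitDefaultDemo {
  // Hypothetical stand-in for a method like rddToPairRDDFunctions:
  // the implicit Ordering parameter falls back to null when the
  // compiler finds no implicit Ordering[K] in scope.
  def orderingFor[K](key: K)(implicit ord: Ordering[K] = null): Option[Ordering[K]] =
    Option(ord)

  // No Ordering can be derived for this type, so the default (null) is used.
  final class PlainKey

  // This type is Comparable to itself, so the compiler can pick up
  // Ordering.ordered from the Ordering companion object.
  final class ComparableKey extends Comparable[ComparableKey] {
    override def compareTo(other: ComparableKey): Int = 0
  }

  def main(args: Array[String]): Unit = {
    println(orderingFor(new PlainKey).isDefined)
    println(orderingFor(new ComparableKey).isDefined)
  }
}
```

Under these assumptions the first call should report that no `Ordering` was supplied and the second that one was. The sketch only illustrates the mechanism; why the compiler finds an implicit `Ordering` in one Hadoop build and not in the other is not established in this comment.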


> Anonymous classes are missing from Spark distribution
> -----------------------------------------------------
>
>                 Key: SPARK-2075
>                 URL: https://issues.apache.org/jira/browse/SPARK-2075
>             Project: Spark
>          Issue Type: Bug
>          Components: Build, Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Paul R. Brown
>            Priority: Critical
>
> Running a job built against the Maven dep for 1.0.0 and the hadoop1 distribution produces:
> {code}
> java.lang.ClassNotFoundException:
> org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1
> {code}
> Here's what's in the Maven dep as of 1.0.0:
> {code}
> jar tvf ~/.m2/repository/org/apache/spark/spark-core_2.10/1.0.0/spark-core_2.10-1.0.0.jar | grep 'rdd/RDD' | grep 'saveAs'
>   1519 Mon May 26 13:57:58 PDT 2014 org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1.class
>   1560 Mon May 26 13:57:58 PDT 2014 org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$2.class
> {code}
> And here's what's in the hadoop1 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop1.0.4.jar| grep 'rdd/RDD' | grep 'saveAs'
> {code}
> I.e., it's not there.  It is in the hadoop2 distribution:
> {code}
> jar tvf spark-assembly-1.0.0-hadoop2.2.0.jar| grep 'rdd/RDD' | grep 'saveAs'
>   1519 Mon May 26 07:29:54 PDT 2014 org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$1.class
>   1560 Mon May 26 07:29:54 PDT 2014 org/apache/spark/rdd/RDD$$anonfun$saveAsTextFile$2.class
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
