spark-issues mailing list archives

From "Aleksander Eskilson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
Date Thu, 20 Oct 2016 14:52:58 GMT

     [ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksander Eskilson updated SPARK-18016:
----------------------------------------
    Description: 
When attempting to encode collections of large Java objects to Datasets having very wide or
deeply nested schemas, code generation can fail, yielding:

{code}
Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection has grown past JVM limit of 0xFFFF
	at org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499)
	at org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439)
	at org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358)
	at org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:11114)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547)
	at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206)
	at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774)
	at org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762)
	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762)
	at org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180)
	at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206)
	at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151)
	at org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139)
	at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112)
	at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206)
	at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377)
	at org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370)
	at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370)
	at org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894)
	at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206)
	at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377)
	at org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369)
	at org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
	at org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564)
	at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420)
	at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206)
	at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374)
	at org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369)
	at org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309)
	at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369)
	at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345)
	at org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396)
	at org.codehaus.janino.ClassBodyEvaluator.compileToClass(ClassBodyEvaluator.java:311)
	at org.codehaus.janino.ClassBodyEvaluator.cook(ClassBodyEvaluator.java:229)
	at org.codehaus.janino.SimpleCompiler.cook(SimpleCompiler.java:196)
	at org.codehaus.commons.compiler.Cookable.cook(Cookable.java:91)
	at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.org$apache$spark$sql$catalyst$expressions$codegen$CodeGenerator$$doCompile(CodeGenerator.scala:905)
	... 35 more
{code}

During generation of the code for SpecificUnsafeProjection, all the mutable variables are
declared up front. If there are too many, the constant pool of the generated class seems to
grow past the JVM limit of 0xFFFF entries reported in the exception above.
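
For a sense of scale, here is a rough back-of-the-envelope estimate in Java. It is not a measurement of Spark's generated code. It assumes that each distinct mutable field the generated class declares and references costs about three constant pool entries (a Utf8 name, a NameAndType, and a Fieldref), with descriptor and class entries shared between fields. The class name, `ENTRIES_PER_FIELD`, and the `baselineEntries` figure are hypothetical, and the comparison against 0xFFFF is approximate.

```java
final class ConstantPoolEstimate {
    // The class-file constant pool count is an unsigned 16-bit value,
    // which is the 0xFFFF limit named in the exception.
    static final int CONSTANT_POOL_LIMIT = 0xFFFF;

    // Assumption: one Utf8 name, one NameAndType and one Fieldref per
    // distinct field; descriptors and the owning class entry are shared.
    static final int ENTRIES_PER_FIELD = 3;

    // Estimated pool size for a class with the given number of fields,
    // where baselineEntries covers everything that is not a field.
    static int estimatedEntries(int fieldCount, int baselineEntries) {
        return baselineEntries + fieldCount * ENTRIES_PER_FIELD;
    }

    static boolean exceedsLimit(int fieldCount, int baselineEntries) {
        return estimatedEntries(fieldCount, baselineEntries) > CONSTANT_POOL_LIMIT;
    }

    // Largest field count that still fits under the limit in this model.
    static int maxFields(int baselineEntries) {
        return (CONSTANT_POOL_LIMIT - baselineEntries) / ENTRIES_PER_FIELD;
    }

    public static void main(String[] args) {
        int baseline = 1000; // hypothetical count of non-field entries
        int max = maxFields(baseline);
        System.out.println("fields that fit: " + max);
        System.out.println("one more field exceeds limit: " + exceedsLimit(max + 1, baseline));
    }
}
```

Under these assumptions, a few tens of thousands of distinct fields in one generated class would be enough to hit the limit, which a very wide or deeply nested schema could plausibly produce.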

This issue seems related to SPARK-17702, which concerned individual methods growing beyond
the 64 KB limit. SPARK-17702 was resolved by breaking extractions into smaller methods, but
that change does not seem to resolve this issue.
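
The general technique of breaking one large method body into smaller methods can be sketched as follows. This illustrates the idea only and is not Spark's code generator. `MethodSplitter`, `split`, the `apply_N` naming, the `Object row` parameter, and the use of a character count as a stand-in for bytecode size are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MethodSplitter {
    // Groups generated statements into helper methods so that each helper
    // body stays under maxChars (a crude proxy for the per-method size limit).
    static List<String> split(List<String> statements, int maxChars) {
        List<String> methods = new ArrayList<>();
        StringBuilder body = new StringBuilder();
        for (String stmt : statements) {
            // Close the current helper if this statement would pass the threshold.
            if (body.length() > 0 && body.length() + stmt.length() > maxChars) {
                methods.add(wrap(methods.size(), body.toString()));
                body.setLength(0);
            }
            body.append("  ").append(stmt).append('\n');
        }
        if (body.length() > 0) {
            methods.add(wrap(methods.size(), body.toString()));
        }
        return methods;
    }

    // Wraps a body in a numbered helper method declaration.
    private static String wrap(int index, String body) {
        return "private void apply_" + index + "(Object row) {\n" + body + "}\n";
    }

    public static void main(String[] args) {
        List<String> stmts = new ArrayList<>();
        for (int k = 0; k < 10; k++) {
            stmts.add("value" + k + " = read(row, " + k + ");");
        }
        List<String> methods = split(stmts, 80);
        System.out.println(methods.size() + " helper methods generated");
        System.out.println(methods.get(0));
    }
}
```

Splitting keeps each method small, but every helper still contributes its own name and method reference to the class's constant pool. That is one plausible reason why this kind of fix alone would not relieve constant pool pressure.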

I've created a small project [1] where I declare a list of "wide" and "nested" Bean objects
that I attempt to encode to a Dataset. This code can trigger the failure for Spark 2.1.0-SNAPSHOT.


[1] - https://github.com/bdrillard/spark-codegen-error

  was:
The same description and stack trace as above, followed by one additional closing sentence:

And I'll additionally attach the error log that shows the code produced and the stacktrace.


> Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
> -----------------------------------------------------------------
>
>                 Key: SPARK-18016
>                 URL: https://issues.apache.org/jira/browse/SPARK-18016
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>            Reporter: Aleksander Eskilson
>


