spark-user mailing list archives

From Timur Shenkao <...@timshenkao.su>
Subject Re: py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet
Date Wed, 10 Jan 2018 19:07:37 GMT
Caused by: org.apache.spark.SparkException: Task not serializable

That's the answer :)

What are you trying to save? Is it empty or None / null?
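
In case it helps others reading a long Py4J trace like the one quoted below: a minimal sketch, in plain Python, of pulling out just the "Caused by" lines so the chain of causes is visible at a glance. The names `cause_chain`, `_CAUSE_RE`, and `sample` are hypothetical helpers made up for this illustration, not part of PySpark or Py4J, and the sample text is an abbreviated copy of the quoted trace.

```python
import re

# Matches lines such as
#   "Caused by: org.apache.spark.SparkException: Task not serializable"
# Leading "> " quote markers from an e-mail reply are tolerated.
# (Hypothetical helper pattern, not a PySpark or Py4J API.)
_CAUSE_RE = re.compile(r"^[>\s]*Caused by:\s*([\w.$]+)(?::\s*(.*))?\s*$")


def cause_chain(trace_text):
    """Return (exception_class, message) pairs, outermost cause first."""
    chain = []
    for line in trace_text.splitlines():
        match = _CAUSE_RE.match(line)
        if match:
            # The message part is optional; use an empty string when absent.
            chain.append((match.group(1), (match.group(2) or "").strip()))
    return chain


# Abbreviated copy of the quoted trace, used only as sample input.
sample = """\
> Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
>     at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
> Caused by: org.apache.spark.SparkException: Task not serializable
>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
> Caused by: java.lang.IllegalArgumentException
> Caused by: java.nio.BufferUnderflowException
"""

for exception_class, message in cause_chain(sample):
    print(exception_class, "-", message)
```

The last entry in the list is the innermost cause, which is usually the most specific hint about what went wrong.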


On Wed, Jan 10, 2018 at 4:58 PM, Liana Napalkova <liana.napalkova@eurecat.org> wrote:

> Hello,
>
>
> Has anybody faced the following problem in PySpark? (Python 2.7.12):
>
>     df.show() # works fine and shows the first 5 rows of DataFrame
>
>     df.write.parquet(outputPath + '/data.parquet', mode="overwrite")  # throws the error
>
> The last line throws the following error:
>
> py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet.
> : org.apache.spark.SparkException: Job aborted.
> 	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
> 	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
> 	at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
> 	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
> 	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
>
> Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
> 	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
> 	at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:123)
> 	at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:248)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>
> Caused by: org.apache.spark.SparkException: Task not serializable
> 	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
> 	at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
> 	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
> 	at org.apache.spark.SparkContext.clean(SparkContext.scala:2287)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:794)
> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>
> Caused by: java.lang.IllegalArgumentException
>         at java.nio.Buffer.position(Buffer.java:244)
>         at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:153)
>         at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>
> Caused by: java.nio.BufferUnderflowException
> 	at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
> 	at java.nio.ByteBuffer.get(ByteBuffer.java:715)
> 	at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes(Binary.java:405)
> 	at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytesUnsafe(Binary.java:414)
> 	at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.writeObject(Binary.java:484)
> 	at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
>
> Thanks.
>
> L.
>
> ------------------------------
> DISCLAIMER: Privileged/Confidential Information may be contained in this
> message. If you are not the addressee indicated in this message you should
> destroy this message, and notify us immediately to the following address:
> legal@eurecat.org. If the addressee of this message does not consent to
> the use of Internet e-mail and message recording, please notify us
> immediately.
> ------------------------------
