spark-user mailing list archives

From Liana Napalkova <liana.napalk...@eurecat.org>
Subject Re: py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet
Date Wed, 10 Jan 2018 20:04:00 GMT
The DataFrame is not empty.
Actually, the problem has nothing to do with serialization. I think it is related to this
bug: https://issues.apache.org/jira/browse/SPARK-22769
I did not post the whole stack trace in my question, but one of the error messages
says `Could not find CoarseGrainedScheduler`, so it is probably related to resources.
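
A minimal sketch, not part of the original message, of one way to list the nested causes in a long Java trace so that messages such as the one above are not overlooked. The helper name `cause_chain` and the constant `SAMPLE_TRACE` are hypothetical; the sample is shortened from the trace quoted further down. In PySpark the text would typically come from `str(e)` on a caught `Py4JJavaError`, which is an assumption about how the trace is obtained.

```python
import re

# Head line of a Java exception: "pkg.Class: message", optionally prefixed
# by "Caused by: " or by ": " (the form Py4J uses for the first Java error).
_HEAD = re.compile(
    r"^(?:Caused by: |: )?([A-Za-z_][\w.$]*(?:Exception|Error))(?:: (.*))?$"
)


def cause_chain(trace_text):
    """Return [(exception_class, message), ...] from outermost error to root cause."""
    chain = []
    for line in trace_text.splitlines():
        line = line.strip()
        if line.startswith("at "):
            continue  # stack frame, not an exception head line
        match = _HEAD.match(line)
        if match:
            chain.append((match.group(1), match.group(2) or ""))
    return chain


# Shortened copy of the trace from this thread, used only for illustration.
SAMPLE_TRACE = """py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet.
: org.apache.spark.SparkException: Job aborted.
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
Caused by: org.apache.spark.SparkException: Task not serializable
        at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
Caused by: java.nio.BufferUnderflowException
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
"""

for exception_class, message in cause_chain(SAMPLE_TRACE):
    print(exception_class, "|", message)
```

The last entry of the returned list is the innermost cause, which is usually the most informative line to search for in the issue tracker.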

________________________________
From: Timur Shenkao <tsh@timshenkao.su>
Sent: 10 January 2018 20:07:37
To: Liana Napalkova
Cc: user@spark.apache.org
Subject: Re: py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet


Caused by: org.apache.spark.SparkException: Task not serializable


That's the answer :)

What are you trying to save? Is it empty or None / null?
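
A minimal sketch of how the check asked about here could be done before writing, assuming `df` is either `None` or a PySpark DataFrame. The helper name `is_empty` is hypothetical; `head(1)` is the DataFrame method that returns a list with at most one row, so the check does not scan the whole dataset.

```python
def is_empty(df):
    """Return True if df is None or has no rows.

    Only one row is requested, so this stays cheap on large DataFrames.
    """
    return df is None or len(df.head(1)) == 0
```

With a real DataFrame this could be used as `if not is_empty(df): df.write.parquet(...)`.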

On Wed, Jan 10, 2018 at 4:58 PM, Liana Napalkova <liana.napalkova@eurecat.org> wrote:

Hello,

Has anybody faced the following problem in PySpark (Python 2.7.12)?

    df.show() # works fine and shows the first 5 rows of DataFrame

    df.write.parquet(outputPath + '/data.parquet', mode="overwrite")  # throws the error

The last line throws the following error:


py4j.protocol.Py4JJavaError: An error occurred while calling o794.parquet.
: org.apache.spark.SparkException: Job aborted.
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
        at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)

Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:123)
        at org.apache.spark.sql.execution.InputAdapter.doExecuteBroadcast(WholeStageCodegenExec.scala:248)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeBroadcast$1.apply(SparkPlan.scala:127)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)

Caused by: org.apache.spark.SparkException: Task not serializable
        at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
        at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
        at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
        at org.apache.spark.SparkContext.clean(SparkContext.scala:2287)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:794)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)

Caused by: java.lang.IllegalArgumentException
        at java.nio.Buffer.position(Buffer.java:244)
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:153)
        at java.nio.ByteBuffer.get(ByteBuffer.java:715)

Caused by: java.nio.BufferUnderflowException
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
        at java.nio.ByteBuffer.get(ByteBuffer.java:715)
        at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytes(Binary.java:405)
        at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.getBytesUnsafe(Binary.java:414)
        at org.apache.parquet.io.api.Binary$ByteBufferBackedBinary.writeObject(Binary.java:484)
        at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)

Thanks.

L.

________________________________
DISCLAIMER: Privileged/Confidential Information may be contained in this message. If you are
not the addressee indicated in this message you should destroy this message, and notify us
immediately to the following address: legal@eurecat.org. If
the addressee of this message does not consent to the use of Internet e-mail and message recording,
please notify us immediately.
________________________________





