spark-user mailing list archives

From Mohamed Nadjib MAMI
Subject Java.util.ArrayList is not a valid external type for schema of array<string>
Date Thu, 13 Oct 2016 22:30:45 GMT
In Spark 1.5.2 I had a job that reads its input with textFile and saves some
data into a Parquet table. One value was of type `ArrayList<String>` and was
successfully saved as an `array<string>` column in the Parquet table. I
upgraded to Spark 2.0.1 and changed the necessary code (SparkConf to
SparkSession, DataFrame to Dataset), so the code compiles without errors.
However, the job no longer finishes. The following exception is thrown:
`java.lang.RuntimeException: Error while encoding:
java.lang.RuntimeException: java.util.ArrayList is not a valid external
type for schema of array<string>`

at the line:

I inspected the schema and it looked fine. Here is the string array column:


...and the value to be saved therein looks like:

[aaa, bbb, ccc]

The column array<string> is constructed this way:
true), true);`

I guess I provided all the necessary code, but if more would help, please let
me know.

So there must have been some logic change in the latest version that affects
the ability to save an ArrayList<String> into an array<string> column in
Parquet tables. Any help on solving or working around this would be much
appreciated.
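
One workaround that may be worth trying, sketched here under an assumption
rather than confirmed against this job: the Spark 2.x row encoder appears to
accept a plain Java array (or a Scala Seq) as the external value for an
array<string> column, but not a java.util.ArrayList. If that holds, copying the
list into a `String[]` before building the Row should satisfy the encoder. The
class and method names below are hypothetical, and the sketch uses only the
JDK so that it runs without Spark on the classpath:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ArrayColumnValue {

    // Hypothetical helper: copies the list into a String[], the shape the
    // Spark 2.x encoder is assumed (not verified here) to accept for an
    // array<string> column.
    static String[] toArrayValue(List<String> values) {
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> tags = new ArrayList<>(Arrays.asList("aaa", "bbb", "ccc"));

        // In the Spark job, columnValue (rather than tags) would be the value
        // handed to the Row for the array<string> column.
        String[] columnValue = toArrayValue(tags);

        System.out.println(Arrays.toString(columnValue)); // prints [aaa, bbb, ccc]
    }
}
```

The conversion is a shallow copy, so it costs one extra array per row; whether
that matters depends on the size of the lists being written.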

*Regards, Grüße, **Cordialement,** Recuerdos, Saluti, προσρήσεις, 问候,
*Mohamed Nadjib Mami*
*PhD Student - EIS Department - **Bonn University (Germany).*
