spark-user mailing list archives

From ehbhaskar <ehbhas...@gmail.com>
Subject [Spark SQL] Couldn't save dataframe with null columns to S3.
Date Tue, 06 Nov 2018 01:02:10 GMT
I have a Spark job that writes data to S3 as shown below.
source_data_df_to_write.select(target_columns_list) \
.write.partitionBy(target_partition_cols_list) \
.format("ORC").save(self.table_location_prefix + self.target_table,
mode="append")

My dataframe can sometimes have null columns. Writing a dataframe with such
columns fails the job with an IllegalArgumentException, as shown below.
Caused by: java.lang.IllegalArgumentException: Error: type expected at the
position 14 of
'double:string:null:string:string:string:double:bigint:null:null:null:null:string:null:string:null:null:null:null:string:string:string:null:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:null:null:null:null:null:null:null:null:null:null:null:null:null:null:null:null:null:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string:string'
but 'null' is found.


A sample dataframe is built like this:
columns_with_default = ("col1, NULL as col2, col3, col4, NULL as col5, "
                        "partition_col1, partition_col2")
source_data_df_to_write = self.session.sql(
    "SELECT %s FROM TEMP_VIEW" % columns_with_default)

So, is there a way to make a Spark job write a dataframe with NULL columns
to S3 in ORC format?
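
One direction that may be worth trying, sketched here under assumptions rather than as a confirmed fix: position 14 in the error is the character offset of the first 'null' entry in the type string ('double:string:' is 14 characters long), which suggests the writer is rejecting columns whose type is null rather than columns that merely contain null values. A bare NULL literal in Spark SQL has no concrete type, so giving each placeholder an explicit type with CAST(NULL AS <type>) might avoid the problem. The helper name build_select_list and the STRING and DOUBLE target types below are hypothetical; the real types would have to match the target table.

```python
# Hypothetical helper (not part of Spark): builds a SELECT list in which every
# placeholder column is a typed NULL, CAST(NULL AS <type>), not a bare NULL.
def build_select_list(columns, null_column_types):
    """Return a comma-separated SELECT list.

    columns: ordered list of output column names.
    null_column_types: dict mapping a column name to the Spark SQL type that
        its NULL placeholder should be cast to.
    """
    parts = []
    for name in columns:
        if name in null_column_types:
            # Typed NULL so the column gets a concrete type in the schema.
            parts.append("CAST(NULL AS %s) AS %s" % (null_column_types[name], name))
        else:
            parts.append(name)
    return ", ".join(parts)


columns_with_default = build_select_list(
    ["col1", "col2", "col4", "col5", "partition_col1", "partition_col2"],
    {"col2": "STRING", "col5": "DOUBLE"},  # assumed target types
)
query = "SELECT %s FROM TEMP_VIEW" % columns_with_default
print(query)
```

The resulting string could then be passed to self.session.sql(...) in place of the original query. On the DataFrame side, the same idea can be expressed with pyspark.sql.functions.lit(None).cast("string") for each placeholder column.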



