I simply read a Parquet table:

scala> val s = sqlContext.read.parquet("oraclehadoop.sales2")
s: org.apache.spark.sql.DataFrame = [prod_id: bigint, cust_id: bigint, time_id: timestamp, channel_id: bigint, promo_id: bigint, quantity_sold: decimal(10,0), amount_sold: decimal(10,0)]

Now all I want is to save the data uncompressed. By default the following saves the table gzipped:

val s4 = s.write.mode("overwrite").parquet("/user/hduser/sales4")
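
For what it is worth, the codec currently in effect can be checked like this (the "gzip" default passed here is my assumption about this Spark version, not something I have verified):

// returns the configured Parquet compression codec, or "gzip" if the key is unset
val codec = sqlContext.getConf("spark.sql.parquet.compression.codec", "gzip")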

However, I want to use this approach without explicitly creating the table myself with sqlContext etc.

This does not seem to work:

val c = sqlContext.setConf("spark.sql.parquet.compression.codec.", "uncompressed")

Can I do this through a method on the DataFrame "s" above, so that the table is saved uncompressed?
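
Something along these lines is what I am hoping for, assuming DataFrameWriter accepts a per-call "compression" option for Parquet in this Spark version (I have not verified that it does):

// Hypothetical sketch: ask the writer itself for uncompressed output,
// without touching the sqlContext configuration first. The output path
// is illustrative only.
val s5 = s.write
  .mode("overwrite")
  .option("compression", "uncompressed")
  .parquet("/user/hduser/sales5")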


Dr Mich Talebzadeh


LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw


