spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Saving parquet table as uncompressed with write.mode("overwrite").
Date Sun, 03 Jul 2016 21:42:22 GMT
Hi,

I simply read a Parquet table

scala> val s = sqlContext.read.parquet("oraclehadoop.sales2")
s: org.apache.spark.sql.DataFrame = [prod_id: bigint, cust_id: bigint,
time_id: timestamp, channel_id: bigint, promo_id: bigint, quantity_sold:
decimal(10,0), amount_sold: decimal(10,0)]

Now all I want is to save the data uncompressed. By default it
saves the table as *gzipped*:

val s4 = s.write.mode("overwrite").parquet("/user/hduser/sales4")

However, I want to use this approach without creating the table explicitly
myself with sqlContext etc.

This does not seem to work:

val c = sqlContext.setConf("spark.sql.parquet.compression.codec.",
"uncompressed")

Can I do this through a method on the DataFrame "s" above, so that the table
is saved uncompressed?
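
For reference, a minimal sketch of the two routes in question, assuming a
spark-shell session in which sqlContext and the DataFrame "s" are the values
defined above. Two things here are assumptions rather than confirmed facts
about this setup: that the trailing dot in the configuration key is the reason
the setting is ignored, and that the per-write "compression" option is
available, which as far as I know requires Spark 2.0 or later.

```scala
// Sketch for a spark-shell session; `sqlContext` and `s` come from the lines above.

// Route 1: session-wide setting. The key is written here without a trailing dot;
// a key with a trailing dot is a different, unrecognised key.
sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")
s.write.mode("overwrite").parquet("/user/hduser/sales4")

// Route 2: per-write setting on the DataFrameWriter
// (assumption: Spark 2.0 or later, where the Parquet writer accepts "compression").
s.write
  .mode("overwrite")
  .option("compression", "none")
  .parquet("/user/hduser/sales4")
```

Both setConf and parquet return Unit, so there is nothing useful to assign to
a val. Route 1 affects every later Parquet write in the session, whereas
Route 2 applies only to the one write it is attached to.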

Thanks,

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
