spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Saving parquet table as uncompressed with write.mode("overwrite").
Date Sun, 03 Jul 2016 22:21:30 GMT
Have you tried the following (note the extraneous dot in your config name)?

val c = sqlContext.setConf("spark.sql.parquet.compression.codec", "none")

Also, parquet() has a compression parameter which defaults to None.
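
For instance, a minimal sketch along these lines should write the data
uncompressed (it reuses the sqlContext and paths from your example below;
adjust as needed):

// set the session-wide Parquet codec before writing
sqlContext.setConf("spark.sql.parquet.compression.codec", "uncompressed")

// re-read the source table and overwrite the target path, now uncompressed
val s = sqlContext.read.parquet("oraclehadoop.sales2")
s.write.mode("overwrite").parquet("/user/hduser/sales4")

If you are on Spark 2.0, the codec can also be chosen per write through the
writer option, e.g. s.write.option("compression", "none"), without touching
the session config.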

FYI

On Sun, Jul 3, 2016 at 2:42 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

> Hi,
>
> I simply read a Parquet table
>
> scala> val s = sqlContext.read.parquet("oraclehadoop.sales2")
> s: org.apache.spark.sql.DataFrame = [prod_id: bigint, cust_id: bigint,
> time_id: timestamp, channel_id: bigint, promo_id: bigint, quantity_sold:
> decimal(10,0), amount_sold: decimal(10,0)]
>
> Now all I want is to save the data uncompressed. By default it saves the
> table as gzipped.
>
> val s4 = s.write.mode("overwrite").parquet("/user/hduser/sales4")
>
> However, I want to use this approach without explicitly creating the table
> myself with sqlContext etc.
>
> This does not seem to work
>
> val c = sqlContext.setConf("spark.sql.parquet.compression.codec.",
> "uncompressed")
>
> Can I do this through a method on the DataFrame "s" above so that the table
> is saved uncompressed?
>
> Thanks,
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
