spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: SparkSQL API to insert DataFrame into a static partition?
Date Wed, 02 Dec 2015 18:30:36 GMT
You might also coalesce to 1 (or some small number) before writing, to avoid
creating a lot of files in that partition if you know there is not a ton of
data.
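
The suggestions in this thread can be sketched roughly as follows. This is a minimal, untested sketch against the Spark 1.x DataFrame writer API; the DataFrame `df`, the partition column `dt`, and the output path are hypothetical placeholders, not from the thread:

```scala
import org.apache.spark.sql.SaveMode

// Hypothetical DataFrame `df` holding the data for one partition.
// coalesce(1) shrinks the number of output files written into the
// partition directory; pick a larger number if the data volume
// warrants more write parallelism.
df.coalesce(1)
  .write
  .mode(SaveMode.Append)     // append to existing partitions
  .partitionBy("dt")         // lays data out as /test/table/dt=.../
  .save("/test/table")
```

With `partitionBy("dt")`, each distinct value of `dt` ends up in its own `dt=<value>` subdirectory, so appending a DataFrame that contains only one `dt` value effectively writes into a single static partition.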

On Wed, Dec 2, 2015 at 12:59 AM, Rishi Mishra <rmishra@snappydata.io> wrote:

> As long as all your data is being inserted by Spark, and hence uses the
> same hash partitioner, what Fengdong mentioned should work.
>
> On Wed, Dec 2, 2015 at 9:32 AM, Fengdong Yu <fengdongy@everstring.com>
> wrote:
>
>> Hi
>> you can try:
>>
>> if your table is under location "/test/table/" on HDFS
>> and has partitions:
>>
>>  "/test/table/dt=2012"
>>  "/test/table/dt=2013"
>>
>> df.write.mode(SaveMode.Append).partitionBy("dt").save("/test/table")
>>
>>
>>
>> On Dec 2, 2015, at 10:50 AM, Isabelle Phan <nliphan@gmail.com> wrote:
>>
>> df.write.partitionBy("date").insertInto("my_table")
>>
>>
>>
>
>
> --
> Regards,
> Rishitesh Mishra,
> SnappyData (http://www.snappydata.io/)
>
> https://in.linkedin.com/in/rishiteshmishra
>
