spark-user mailing list archives

From haridass saisriram <haridass.saisri...@gmail.com>
Subject SparkSQL: Reading data from hdfs and storing into multiple paths
Date Thu, 01 Oct 2015 21:11:08 GMT
Hi,

  I am trying to find a simple example of reading a data file on HDFS. The
file has the following format:
a , b  , c ,yyyy,mm
a1,b1,c1,2015,09
a2,b2,c2,2014,08


I would like to read this file and store it in HDFS partitioned by year and
month, something like this:
/path/to/hdfs/yyyy/mm

I want to specify only "/path/to/hdfs/"; the yyyy/mm part should be populated
automatically from the values in those columns. Could someone point me in the
right direction?

Thank you,
Sri Ram
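
One possible direction, as a minimal sketch rather than a definitive answer, is the DataFrame writer's partitionBy. The sketch assumes PySpark on Spark 2.x or later (SparkSession and the built-in CSV reader), which is newer than the release current when this message was sent. The names partition_dir, write_partitioned, and main, and the input path /path/to/input.csv, are hypothetical and chosen only for illustration. Note that partitionBy writes Hive-style directories such as /path/to/hdfs/yyyy=2015/mm=09, not bare /path/to/hdfs/2015/09.

```python
# Sketch only: assumes Spark 2.x or later (SparkSession, built-in CSV reader).
COLUMNS = ["a", "b", "c", "yyyy", "mm"]  # names taken from the sample header


def partition_dir(base, year, month):
    """Hive-style directory partitionBy is expected to use for one (year, month)."""
    return "{}/yyyy={}/mm={}".format(base.rstrip("/"), year, month)


def write_partitioned(spark, src, dst):
    """Read the CSV at src and write it under dst, one directory per yyyy/mm."""
    df = (
        spark.read
        .option("header", "true")                   # first line holds column names
        .option("ignoreLeadingWhiteSpace", "true")  # sample header has stray spaces
        .option("ignoreTrailingWhiteSpace", "true")
        .csv(src)
        .toDF(*COLUMNS)                             # force clean column names
    )
    # Without inferSchema every column stays a string, so "09" keeps its zero.
    # The default save mode fails if dst already exists, which avoids clobbering data.
    df.write.partitionBy("yyyy", "mm").csv(dst)


def main():
    # Imported lazily so the helpers above load without pyspark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("partition-by-year-month").getOrCreate()
    # "/path/to/input.csv" is a hypothetical input location.
    write_partitioned(spark, "/path/to/input.csv", "/path/to/hdfs/")
    spark.stop()
```

With this approach only the base path is supplied by the caller; the yyyy and mm values are taken from the data, and those two columns are encoded in the directory names instead of being repeated inside the written files.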
