Hi,

Naga has kindly suggested here that I should load the files into an RDD and strip the header. But each of my partitions contains hundreds of files, and opening and processing files through RDDs is a dated way of working. I think the Spark community has moved on from RDDs to DataFrames, and now to Datasets.

I know we still need RDDs for special cases, but being asked to use an RDD for a CSV file just to avoid the header does not sound quite right to me.
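
To make the comparison concrete, here is a minimal sketch of the DataFrame route I would prefer, assuming Spark 2.x with PySpark. The helper name and the path in the usage note are hypothetical; `header` is the CSV reader option, and as far as I understand it applies to every file under the given path, so no RDD-level header stripping should be needed.

```python
# Sketch only: `read_csv_with_header` is a hypothetical helper name,
# and `spark` is assumed to be an existing SparkSession.

def read_csv_with_header(spark, base_path):
    """Read all CSV files under base_path into one DataFrame.

    With header=true the CSV reader uses the first line for column
    names instead of returning it as a data row.
    """
    return (
        spark.read
        .option("header", "true")
        .csv(base_path)
    )
```

Called as `read_csv_with_header(spark, "/data/my_table")` on a directory laid out like `/data/my_table/dt=2017-09-08/*.csv` (a made-up layout), partition discovery should also expose `dt` as a column.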



Regards,
Gourav Sengupta

On Fri, Sep 8, 2017 at 7:25 PM, Gourav Sengupta <gourav.sengupta@gmail.com> wrote:
Hi,

According to https://issues.apache.org/jira/browse/SPARK-11374, Spark will not resolve the issue with the skip-header option when the table is defined in Hive.

But I cannot find a Spark SQL option for setting up an external partitioned table.

Does that mean that if I need to create an external partitioned table I must use Hive, and that when I use Hive, Spark does not allow me to ignore the headers?


Regards,
Gourav Sengupta