spark-user mailing list archives

From Somnath Pandeya <>
Subject RE: skipping header from each file
Date Fri, 09 Jan 2015 08:59:19 GMT
Maybe you can use the wholeTextFiles method, which returns the filename and content of each
file as a pair RDD; you can then remove the first line from each file.
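
A minimal sketch of that idea, in case it helps. The object name SkipHeaders, the helper names dropHeader and readWithoutHeaders, and the local master setting are only illustrative, and I am assuming the comma-separated path string is accepted by wholeTextFiles the same way it is by textFile. Note that wholeTextFiles loads each file as a single string, so this suits files that are small enough to fit in memory.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object SkipHeaders {
  // Hypothetical helper: drop the first (header) line of one file's content
  // and return the remaining lines.
  def dropHeader(content: String): Seq[String] =
    content.split("\r?\n").drop(1).toSeq

  // Hypothetical helper: read each file whole, strip its header line,
  // and flatten everything into a single RDD of data lines.
  def readWithoutHeaders(sc: SparkContext, paths: String): RDD[String] =
    sc.wholeTextFiles(paths) // RDD[(fileName, fileContent)]
      .flatMap { case (_, content) => dropHeader(content) }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for trying the sketch out on a single machine.
    val conf = new SparkConf().setAppName("skip-headers").setMaster("local[*]")
    val sc = new SparkContext(conf)
    // Same placeholder paths as in the original question.
    val rdd = readWithoutHeaders(sc, "file1,file2,file3")
    rdd.take(5).foreach(println)
    sc.stop()
  }
}
```

The header is removed per file rather than per partition, so it does not matter how many files are passed in or how Spark splits them.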

-----Original Message-----
From: Hafiz Mujadid [] 
Sent: Friday, January 09, 2015 11:48 AM
Subject: skipping header from each file

Suppose I give three file paths to the Spark context to read, and each file has its schema in
the first row. How can we skip the schema (header) line of each file?

val rdd=sc.textFile("file1,file2,file3");

