spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: skipping header from each file
Date Fri, 09 Jan 2015 06:48:06 GMT
Did you try something like:

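    // Read every file under the input directory as one RDD of lines
    val file = sc.textFile("/home/akhld/sigmoid/input")

    // Drop rows containing the text "header"; note that this also drops
    // any data row that happens to contain the same text
    val skipped = file.filter(row => !row.contains("header"))

    // Print the first 10 remaining rows as a quick check
    skipped.take(10).foreach(println)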
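
If the header rows have no text in common that could be matched, one alternative is to drop the first line of each file regardless of what it contains. The sketch below is only an illustration under stated assumptions: `HeaderSkip` and `dropHeader` are hypothetical names (not Spark APIs), the logic is plain Scala so it can be tried without a cluster, and the Spark wiring through `sc.wholeTextFiles` appears only as a comment. It assumes each file is small enough to be held in memory as a single record.

```scala
// Sketch only: HeaderSkip and dropHeader are hypothetical names, not Spark APIs.
object HeaderSkip {
  // Split one file's full text into lines and drop the first (header) line.
  def dropHeader(content: String): Iterator[String] =
    content.split("\r?\n").iterator.drop(1)

  // Assumed Spark wiring (needs an existing SparkContext `sc`; not executed here):
  //   val data = sc.wholeTextFiles("/home/akhld/sigmoid/input")
  //                .flatMap { case (_, content) => HeaderSkip.dropHeader(content) }

  def main(args: Array[String]): Unit = {
    // Two in-memory "files", each starting with the same schema row
    val files = Seq("id,name\n1,a\n2,b", "id,name\n3,c")
    files.flatMap(dropHeader).foreach(println)
  }
}
```

Because `wholeTextFiles` yields one (path, content) pair per file, the first line of each record is that file's header. If all files instead share one identical header line, something like `val header = file.first()` followed by `file.filter(_ != header)` may be simpler, provided no data row equals the header.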

Thanks
Best Regards

On Fri, Jan 9, 2015 at 11:48 AM, Hafiz Mujadid <hafizmujadid00@gmail.com>
wrote:

> Suppose I give three file paths to the Spark context to read, and each file
> has its schema in the first row. How can we skip the schema line of each file?
>
>
> val rdd=sc.textFile("file1,file2,file3");
>
