spark-user mailing list archives

From Sunny Khatri <sunny.k...@gmail.com>
Subject Re: Fwd: Spark SQL: ArrayIndexOutofBoundsException
Date Thu, 02 Oct 2014 23:06:07 GMT
You could use filter with startsWith to skip the header line.
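
A minimal sketch of that idea, assuming the header line begins with a known prefix. The prefix "id", the object name, and the local master setting are placeholders for illustration and do not come from the original message:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DropHeaderByPrefix {
  // Hypothetical: the text the header line is assumed to start with.
  val headerPrefix = "id"

  // True for data lines, false for any line that starts with the header prefix.
  def isData(line: String): Boolean = !line.startsWith(headerPrefix)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("drop-header-by-prefix").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val inpFile = args(0) // input path supplied by the caller
    val data = sc.textFile(inpFile).filter(isData)

    // Rest of the processing would go here; print a few lines as a check.
    data.take(5).foreach(println)
    sc.stop()
  }
}
```

Note that this drops every line that starts with the prefix, not only the first one, so the prefix should be something that cannot appear at the start of a data line.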

On Thu, Oct 2, 2014 at 4:04 PM, SK <skrishna.id@gmail.com> wrote:

> Thanks for the help. Yes, I did not realize that the first header line has a
> different separator.
>
> By the way, is there a way to drop the first line that contains the header?
> Something along the following lines:
>
>       sc.textFile(inp_file)
>           .drop(1)  // or tail() to drop the header line
>           .map....  // rest of the processing
>
> I could not find a drop() function, or a way to take the bottom n elements,
> for an RDD. Alternatively, a way to create the case class schema from the
> header line of the file and use the rest for the data would be useful, just
> as a suggestion. Currently I am just deleting this header line manually
> before processing the file in Spark.
>
>
> thanks
>
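
For the positional drop(1) asked about in the quoted message, one possible approach is to drop the first element of the first partition with mapPartitionsWithIndex, and to read the header separately with first(). This is only a sketch: it assumes a single input file (so the header sits at the start of partition 0), and the "," separator, the object name, and the map-based record are assumptions for illustration, not a case class schema:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object DropHeaderByPosition {
  // Drops the first element of partition 0 only; other partitions pass through.
  def dropHeader(index: Int, lines: Iterator[String]): Iterator[String] =
    if (index == 0) lines.drop(1) else lines

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("drop-header-by-position").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val lines = sc.textFile(args(0)) // input path supplied by the caller

    // Assumed separator; the header may use a different one than the data.
    val fieldNames = lines.first().split(",").map(_.trim)

    val rows = lines
      .mapPartitionsWithIndex((i, it) => dropHeader(i, it))
      .map(_.split(","))

    // Pair each value with its header name instead of a hand-written schema.
    val records = rows.map(values => fieldNames.zip(values).toMap)

    records.take(5).foreach(println)
    sc.stop()
  }
}
```

If the input is a directory of several files that each carry their own header, only the header at the start of partition 0 is removed, so the prefix filter may be the safer choice there.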
