spark-user mailing list archives

From Jason Lenderman <jslender...@gmail.com>
Subject Re: how to avoid reading the first line of dataframe?
Date Wed, 25 Sep 2013 03:54:22 GMT
Perhaps you could use mapPartitionsWithIndex to do this.
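
A minimal sketch of that idea, written in Python for illustration. It assumes the header is the first record of partition 0, which is the case when a single text file is read in order. The helper name drop_header and the sample rows are hypothetical, and plain lists stand in for the RDD's partitions so the snippet runs without Spark.

```python
from itertools import islice


def drop_header(partition_index, lines):
    """Skip the first record of partition 0; pass other partitions through."""
    if partition_index == 0:
        # Only the first partition is assumed to hold the header line.
        return islice(lines, 1, None)
    return lines


# Hypothetical data: two "partitions" of one CSV file, header in the first.
partitions = [["id,name", "1,ann", "2,bob"], ["3,cy", "4,dee"]]

# Apply the function per partition, the way mapPartitionsWithIndex would.
result = [
    row
    for index, part in enumerate(partitions)
    for row in drop_header(index, iter(part))
]
print(result)
```

With a real RDD, the same function would be passed as rdd.mapPartitionsWithIndex(drop_header), assuming a Spark version whose Python API exposes that method. Unlike filter, this applies no per-row predicate: the partition index is checked once per partition and every other row passes through untouched.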


On Tue, Sep 24, 2013 at 4:52 PM, Michael Kun Yang <kunyang@stanford.edu> wrote:

> Spark's filter can do this job, but it needs to scan every line (row). Is
> there a way to just skip the first line in the file?
>
> Any feedback?
>
>
> On Tue, Sep 24, 2013 at 4:14 PM, Michael Kun Yang <kunyang@stanford.edu> wrote:
>
>> Dataframes usually have headers in the first row. How can I avoid reading
>> the first row?
>> I know that in Hadoop I can figure it out from the line number.
>>
>> Best
>>
>
>
