spark-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: File present but file not found exception
Date Mon, 19 May 2014 13:50:35 GMT
Why does it need to be a local file? Why not do some filter ops on the HDFS
file and save the result back to HDFS, from where you can create an RDD?
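Something like this rough sketch, where the paths and the filter condition
are made-up placeholders:

    val raw = sc.textFile("hdfs:///data/input.txt")          // readable from every executor
    val filtered = raw.filter(_.contains("ERROR"))           // placeholder filter op
    filtered.saveAsTextFile("hdfs:///data/input-filtered")   // result lands back in HDFS
    val rdd = sc.textFile("hdfs:///data/input-filtered")     // new RDD over the result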

Alternatively, you can read a small file in the driver program and use
sc.parallelize to turn it into an RDD.
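A sketch of that too, using the path from the original mail; the file is
read and closed on the driver only:

    import scala.io.Source
    val src = Source.fromFile("/home/sparkcluster/spark/input.txt")  // driver-local read
    val lines = try src.getLines().toList finally src.close()
    val rdd = sc.parallelize(lines)  // distribute the small dataset across the cluster
    rdd.top(1)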
On May 16, 2014 7:01 PM, "Sai Prasanna" <ansaiprasanna@gmail.com> wrote:

> I found that if a file is present at the given path on the local FS of
> every node, then reading is possible.
>
> But is there a way to read it if the file is present only on certain
> nodes? [There should be a way!]
>
> *NEED: I want to do some filter ops on an HDFS file, create a local file
> from the result, create an RDD out of it, and operate on that.*
>
> Is there any way out?
>
> Thanks in advance!
>
>
>
>
> On Fri, May 9, 2014 at 12:18 AM, Sai Prasanna <ansaiprasanna@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I think everyone is pretty busy; the response time in this group has
>> increased slightly.
>>
>> Anyway, this is a pretty silly problem, but I could not get past it.
>>
>> I have a file in my local FS, but when I try to create an RDD out of it,
>> the tasks fail with a FileNotFoundException in the log files.
>>
>> *var file = sc.textFile("file:///home/sparkcluster/spark/input.txt");*
>> *file.top(1);*
>>
>> input.txt exists in the above folder, but Spark still couldn't find it.
>> Do some parameters need to be set?
>>
>> Any help is really appreciated. Thanks!
>>
>
>
