spark-user mailing list archives

From Dino Fancellu <d...@felstar.com>
Subject Re: Local Spark talking to remote HDFS?
Date Mon, 24 Aug 2015 20:10:39 GMT
Changing the IP to the guest's IP address doesn't help: it just never connects.

The VM has port tunnelling, and it passes all the main ports, 8020
included, through to the host.

You can tell that it was talking to the guest VM before, simply
because it told me when a file was not found.

The error is:

Exception in thread "main" org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most
recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost):
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block:
BP-452094660-10.0.2.15-1437494483194:blk_1073742905_2098
file=/tmp/people.txt

but I have no idea what it means by that. It certainly can find the
file and knows it exists.
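
One way to narrow this down, as a rough sketch rather than a confirmed diagnosis: the block pool ID in the error embeds an IP address (10.0.2.15 here), so it may be worth checking which addresses and ports are actually reachable from the host. The object name BlockPoolCheck and its helper methods are made up for illustration, and port 50010 is only the Hadoop 2.x default for dfs.datanode.address, assumed to be unchanged in this VM.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper, not part of Spark or Hadoop.
object BlockPoolCheck {
  // Block pool IDs look like BP-<random>-<IP address>-<creation time in ms>.
  private val BlockPoolId = """BP-\d+-(\d{1,3}(?:\.\d{1,3}){3})-\d+""".r

  /** Extracts the IP address embedded in a block pool ID, if it is well-formed. */
  def embeddedIp(blockPoolId: String): Option[String] = blockPoolId match {
    case BlockPoolId(ip) => Some(ip)
    case _               => None
  }

  /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
  def canConnect(host: String, port: Int, timeoutMs: Int = 2000): Boolean =
    Try {
      val socket = new Socket()
      try socket.connect(new InetSocketAddress(host, port), timeoutMs)
      finally socket.close()
    }.isSuccess

  def main(args: Array[String]): Unit = {
    val ip = embeddedIp("BP-452094660-10.0.2.15-1437494483194")
    println(s"Address embedded in block pool ID: $ip")
    // 8020 is the NameNode port from the original post; 50010 is an assumed
    // default DataNode transfer port and may differ in a given setup.
    for (host <- Seq("localhost") ++ ip.toSeq; port <- Seq(8020, 50010))
      println(s"$host:$port reachable = ${canConnect(host, port)}")
  }
}
```

If the NameNode port answers but the embedded address or the DataNode port does not, that would be consistent with the file metadata being found while the block data cannot be fetched.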



On 24 August 2015 at 20:43, Roberto Congiu <roberto.congiu@gmail.com> wrote:
> When you launch your HDP guest VM, most likely it gets launched with NAT and
> an address on a private network (192.168.x.x) so on your windows host you
> should use that address (you can find out using ifconfig on the guest OS).
> I usually add an entry to my /etc/hosts for VMs that I use often....if you
> use vagrant, there's also a vagrant module that can do that automatically.
> Also, I am not sure how the default HDP VM is set up, that is, if it only
> binds HDFS to 127.0.0.1 or to all addresses. You can check that with netstat
> -a.
>
> R.
>
> 2015-08-24 11:46 GMT-07:00 Dino Fancellu <dino@felstar.com>:
>>
>> I have a file in HDFS inside my HortonWorks HDP 2.3_1 VirtualBox VM.
>>
>> If I go into the guest spark-shell and refer to the file thus, it works fine
>>
>>   val words=sc.textFile("hdfs:///tmp/people.txt")
>>   words.count
>>
>> However if I try to access it from a local Spark app on my Windows host,
>> it doesn't work
>>
>>   import org.apache.spark.{SparkConf, SparkContext}
>>
>>   val conf = new SparkConf().setMaster("local").setAppName("My App")
>>   val sc = new SparkContext(conf)
>>
>>   val words = sc.textFile("hdfs://localhost:8020/tmp/people.txt")
>>   words.count
>>
>> Emits
>>
>>
>>
>> The port 8020 is open, and if I choose the wrong file name, it will tell me
>>
>>
>>
>> My pom has
>>
>>   <dependency>
>>     <groupId>org.apache.spark</groupId>
>>     <artifactId>spark-core_2.11</artifactId>
>>     <version>1.4.1</version>
>>     <scope>provided</scope>
>>   </dependency>
>>
>> Am I doing something wrong?
>>
>> Thanks.
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Local-Spark-talking-to-remote-HDFS-tp24425.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
>> For additional commands, e-mail: user-help@spark.apache.org
>>
>
