Hi Tejas,

I have no clue on this one, I really don't.
I've posted over on user@hadoop... hopefully something will offer itself forth to us.
In the meantime, thanks for your interest.

On Mon, Jan 7, 2013 at 9:32 PM, Tejas Patil <tejas.patil.cs@gmail.com> wrote:
Hi Lewis,

I feel that this issue might be related to the "/etc/hosts" file. In [0], Dennis Kubes suggested a change to the hosts file (the same thing is mentioned in article [1]). In [2], they suggested checking whether ssh works using both the hostname and the IP.

[0] : http://lucene.472066.n3.nabble.com/Nutch-Crawling-error-td612107.html
[1] : http://www.thegeekstuff.com/2012/02/hadoop-standalone-installation/
[2] : http://www.mail-archive.com/user@cassandra.apache.org/msg16668.html
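
A quick way to test the hosts-file theory on the build machine might be a small Java check along these lines. It is only a sketch: the class and method names (HostnameCheck, describe, describeLocalHost) are made up for illustration. It calls the same InetAddress.getLocalHost() that appears in the stack trace and reports whether the machine's own hostname resolves.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Nutch or Hadoop.
public class HostnameCheck {

    // Reports how a given name resolves, or why it does not.
    static String describe(String name) {
        try {
            InetAddress addr = InetAddress.getByName(name);
            return name + " -> " + addr.getHostAddress();
        } catch (UnknownHostException e) {
            return name + " -> UNRESOLVED (" + e.getMessage() + ")";
        }
    }

    // Same call that JobClient makes in the stack trace.
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "local host: " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "local host: UNRESOLVED (" + e.getMessage() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
        System.out.println(describe("localhost"));
    }
}
```

If the first line comes back as UNRESOLVED on the Jenkins node but not on our own machines, that would point at the hostname or hosts-file setup there rather than at the Hadoop API.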

Tejas Patil

On Mon, Jan 7, 2013 at 7:44 PM, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:
Hi Tejas,

Jenkins seems to have had a reboot (or something of that nature) around Christmas. To be honest, I don't know the source of the problem.
That said, Hadoop (and other technologies) can also be a funny bugger sometimes when it comes to security, proxies and inet addresses. We've witnessed this a good few times over in Gora, where HBase/Hadoop servers fail to start due to intermittent proxy settings/problems... I don't know enough about the code to provide a definitive answer.

For the time being, I am happy to progress with integrating tests and minor tickets into Nutch; however, I think we really ought to sort out the source of this stack trace, regardless of the fact that it is on Jenkins.

Maybe we should head over to hadoop general?

I thought I would let this thread stew a while before pouncing on it again.


On Mon, Jan 7, 2013 at 7:22 PM, Tejas Patil <tejas.patil.cs@gmail.com> wrote:
Hi Lewis,

These test cases pass on my machine (I guess on yours too). Had it been related to the Hadoop API, the tests would fail everywhere. What is different about the setup where the nightly builds are executed?

Tejas Patil

On Mon, Jan 7, 2013 at 3:24 PM, Lewis John Mcgibbney <lewis.mcgibbney@gmail.com> wrote:
Hi All,

An update to this issue then...

The failing tests point to an additional (security?) feature that makes the tests fail on Jenkins, as it results in the following stack traces for the following tests


Same every time.
Currently I am not sure how we can work around this; however, I suspect that we need to use some other aspect of the Hadoop API in all calls that obtain socket addresses from servers.
Any ideas?

java.net.UnknownHostException: -s: -s
	at java.net.InetAddress.getLocalHost(InetAddress.java:1354)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:886)
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1323)
	at org.apache.nutch.crawl.Generator.generate(Generator.java:551)
	at org.apache.nutch.crawl.Generator.generate(Generator.java:465)
	at org.apache.nutch.crawl.TestGenerator.generateFetchlist(TestGenerator.java:313)
	at org.apache.nutch.crawl.TestGenerator.testFilter(TestGenerator.java:259)
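
One way to read the first line of the trace: as far as I know, when InetAddress.getLocalHost() fails, the JDK builds the exception message as "<hostname>: <detail>". The sketch below assumes that shape and pulls out the name the JVM was trying to resolve. LocalHostMessage and hostFromMessage are hypothetical names used only for illustration.

```java
// Hypothetical helper for reading UnknownHostException messages.
public class LocalHostMessage {

    // Returns the part before the first ": ", assumed to be the hostname
    // the JVM tried to resolve; returns the whole message if no separator.
    static String hostFromMessage(String message) {
        if (message == null) {
            return null;
        }
        int sep = message.indexOf(": ");
        return sep < 0 ? message : message.substring(0, sep);
    }

    public static void main(String[] args) {
        // Message text copied from the stack trace above.
        System.out.println(hostFromMessage("-s: -s")); // prints -s
    }
}
```

If that assumption holds, the build machine's hostname is being reported as "-s", which looks more like a stray command-line flag than a real hostname. That is only a guess from the message text and is not confirmed anywhere in this thread.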