whirr-user mailing list archives

From Andrei Savu <savu.and...@gmail.com>
Subject Re: Bad connection to FS. command aborted.
Date Thu, 23 Feb 2012 22:14:07 GMT
See the last 3 comments on
https://issues.apache.org/jira/browse/WHIRR-490 - I will commit a fix
now to both branch-0.7 and trunk. Sorry for any inconvenience. I'm glad
we found this problem now, before building the release candidate.

Thanks!

On Thu, Feb 23, 2012 at 9:07 PM, Andrei Savu <savu.andrei@gmail.com> wrote:

> This is the branch for 0.7.1 RC0, and all tests are working as expected:
> https://svn.apache.org/repos/asf/whirr/branches/branch-0.7
>
> Can you give it a try? I'm still checking mapred.child.ulimit.
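>
> A checkout-and-build along these lines should do it (just a sketch; I'm
> assuming the usual Maven build, with tests skipped to save time):
>
>   svn checkout https://svn.apache.org/repos/asf/whirr/branches/branch-0.7
>   cd branch-0.7 && mvn clean install -DskipTests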
>
>
> On Thu, Feb 23, 2012 at 9:05 PM, Edmar Ferreira <
> edmaroliveiraferreira@gmail.com> wrote:
>
>> Just some changes in install_hadoop.sh to install Ruby and some
>> dependencies.
>> I'm running Whirr from trunk, and I built it about 5 days ago.
>> Do you think I need to do an svn checkout and build it again?
>>
>>
>> On Thu, Feb 23, 2012 at 6:53 PM, Andrei Savu <savu.andrei@gmail.com> wrote:
>>
>>> It's strange that this is happening, because the integration tests
>>> work as expected (we actually run MR jobs).
>>>
>>> Are you adding any other options?
>>>
>>>
>>> On Thu, Feb 23, 2012 at 8:50 PM, Andrei Savu <savu.andrei@gmail.com> wrote:
>>>
>>>> That looks like a change we've made in
>>>> https://issues.apache.org/jira/browse/WHIRR-490
>>>>
>>>> It seems like "unlimited" is not a valid value for mapred.child.ulimit.
>>>> Let me investigate a bit more.
>>>>
>>>> In the meantime, you can add something like this to your .properties file:
>>>>
>>>> hadoop-mapreduce.mapred.child.ulimit=<very-large-number>
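>>>>
>>>> For example (an illustrative value only, not a recommendation;
>>>> mapred.child.ulimit is expressed in KB of virtual memory, so this
>>>> allows roughly 4 GB per task):
>>>>
>>>>   hadoop-mapreduce.mapred.child.ulimit=4194304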
>>>>
>>>>
>>>> On Thu, Feb 23, 2012 at 8:36 PM, Edmar Ferreira <
>>>> edmaroliveiraferreira@gmail.com> wrote:
>>>>
>>>>> I changed it, and the cluster is running; I can access the FS and
>>>>> submit jobs, but all jobs always fail with this strange error:
>>>>>
>>>>> java.lang.NumberFormatException: For input string: "unlimited"
>>>>> 	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>>>>> 	at java.lang.Integer.parseInt(Integer.java:481)
>>>>> 	at java.lang.Integer.valueOf(Integer.java:570)
>>>>> 	at org.apache.hadoop.util.Shell.getUlimitMemoryCommand(Shell.java:86)
>>>>> 	at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:379)
>>>>>
>>>>>
>>>>>
>>>>> Also, when I try to access the full error log, I see this in the browser:
>>>>>
>>>>> HTTP ERROR: 410
>>>>>
>>>>> Failed to retrieve stdout log for task: attempt_201202232026_0001_m_000005_0
>>>>>
>>>>> RequestURI=/tasklog
>>>>>
>>>>>
>>>>> My proxy is running, and I'm using the SOCKS proxy on localhost:6666.
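>>>>>
>>>>> For reference, I started it with the proxy script Whirr wrote out,
>>>>> something like:
>>>>>
>>>>>   . ~/.whirr/hadoop/hadoop-proxy.sh
>>>>>
>>>>> and, if I read the generated hadoop-site.xml right, it points
>>>>> hadoop.socks.server at localhost:6666.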
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Feb 23, 2012 at 5:25 PM, Andrei Savu <savu.andrei@gmail.com> wrote:
>>>>>
>>>>>> That should work, but I recommend trying:
>>>>>>
>>>>>>
>>>>>> http://apache.osuosl.org/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz
>>>>>>
>>>>>> archive.apache.org is extremely unreliable.
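>>>>>>
>>>>>> i.e. the same two properties as below, just pointed at the mirror,
>>>>>> something along the lines of:
>>>>>>
>>>>>>   whirr.hadoop.version=0.20.2
>>>>>>   whirr.hadoop.tarball.url=http://apache.osuosl.org/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz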
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 23, 2012 at 7:18 PM, Edmar Ferreira <
>>>>>> edmaroliveiraferreira@gmail.com> wrote:
>>>>>>
>>>>>>> I will destroy this cluster and launch it again with these lines in
>>>>>>> the properties file:
>>>>>>>
>>>>>>>
>>>>>>> whirr.hadoop.version=0.20.2
>>>>>>> whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
>>>>>>>
>>>>>>>
>>>>>>> Any other ideas?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 23, 2012 at 5:16 PM, Andrei Savu <savu.andrei@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yep, so I think this is the root cause. I'm pretty sure you need
>>>>>>>> to be running the same version locally and on the cluster.
>>>>>>>>
>>>>>>>> On Thu, Feb 23, 2012 at 7:14 PM, Edmar Ferreira <
>>>>>>>> edmaroliveiraferreira@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> When I run "hadoop version" on one of the cluster machines, I get:
>>>>>>>>>
>>>>>>>>> Warning: $HADOOP_HOME is deprecated.
>>>>>>>>>
>>>>>>>>> Hadoop 0.20.205.0
>>>>>>>>> Subversion
>>>>>>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205
>>>>>>>>> -r 1179940
>>>>>>>>> Compiled by hortonfo on Fri Oct  7 06:20:32 UTC 2011
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> When I run "hadoop version" on my local machine, I get:
>>>>>>>>>
>>>>>>>>> Hadoop 0.20.2
>>>>>>>>> Subversion
>>>>>>>>> https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20
>>>>>>>>> -r 911707
>>>>>>>>> Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Feb 23, 2012 at 5:05 PM, Andrei Savu <
>>>>>>>>> savu.andrei@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Does the local Hadoop version match the remote one?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 23, 2012 at 7:00 PM, Edmar Ferreira <
>>>>>>>>>> edmaroliveiraferreira@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, I did a
>>>>>>>>>>>
>>>>>>>>>>> export HADOOP_CONF_DIR=~/.whirr/hadoop/
>>>>>>>>>>>
>>>>>>>>>>> before running hadoop fs -ls
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 23, 2012 at 4:56 PM, Ashish <paliwalashish@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Did you set HADOOP_CONF_DIR=~/.whirr/<your cluster name> from
>>>>>>>>>>>> the shell where you are running the hadoop command?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Feb 24, 2012 at 12:23 AM, Andrei Savu <
>>>>>>>>>>>> savu.andrei@gmail.com> wrote:
>>>>>>>>>>>> > That looks fine.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Anything interesting in the Hadoop logs on the remote
>>>>>>>>>>>> > machines? Are all the daemons running as expected?
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Thu, Feb 23, 2012 at 6:48 PM, Edmar Ferreira
>>>>>>>>>>>> > <edmaroliveiraferreira@gmail.com> wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> last lines
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running configure phase scripts on all cluster instances
>>>>>>>>>>>> >> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-namenode
>>>>>>>>>>>> >> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Namenode web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50070
>>>>>>>>>>>> >> 2012-02-23 16:04:30,242 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop site file /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-site.xml
>>>>>>>>>>>> >> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop proxy script /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-proxy.sh
>>>>>>>>>>>> >> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-jobtracker
>>>>>>>>>>>> >> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Jobtracker web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50030
>>>>>>>>>>>> >> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopDataNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-datanode
>>>>>>>>>>>> >> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopTaskTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-tasktracker
>>>>>>>>>>>> >> 2012-02-23 16:04:30,253 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running start phase scripts on all cluster instances
>>>>>>>>>>>> >> 2012-02-23 16:04:30,257 DEBUG [org.apache.whirr.service.ComputeCache] (Thread-3) closing ComputeServiceContext {provider=aws-ec2, endpoint=https://ec2.us-east-1.amazonaws.com, apiVersion=2010-06-15, buildVersion=, identity=08WMRG9HQYYGVQDT57R2, iso3166Codes=[US-VA, US-CA, US-OR, BR-SP, IE, SG, JP-13]}
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> On Thu, Feb 23, 2012 at 4:31 PM, Andrei Savu <
>>>>>>>>>>>> >> savu.andrei@gmail.com> wrote:
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> I think this is the first time I've seen this. Anything
>>>>>>>>>>>> >>> interesting in the logs?
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>> On Thu, Feb 23, 2012 at 6:27 PM, Edmar Ferreira
>>>>>>>>>>>> >>> <edmaroliveiraferreira@gmail.com> wrote:
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Hi guys,
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> When I launch a cluster and run the proxy, everything
>>>>>>>>>>>> >>>> seems to be right, but when I try to use any hadoop
>>>>>>>>>>>> >>>> command I get this error:
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Bad connection to FS. command aborted.
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Any suggestions?
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> Thanks
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>> --
>>>>>>>>>>>> >>>> Edmar Ferreira
>>>>>>>>>>>> >>>> Co-Founder at Everwrite
>>>>>>>>>>>> >>>>
>>>>>>>>>>>> >>>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> --
>>>>>>>>>>>> >> Edmar Ferreira
>>>>>>>>>>>> >> Co-Founder at Everwrite
>>>>>>>>>>>> >>
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> thanks
>>>>>>>>>>>> ashish
>>>>>>>>>>>>
>>>>>>>>>>>> Blog: http://www.ashishpaliwal.com/blog
>>>>>>>>>>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Edmar Ferreira
>>>>>>>>>>> Co-Founder at Everwrite
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Edmar Ferreira
>>>>>>>>> Co-Founder at Everwrite
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Edmar Ferreira
>>>>>>> Co-Founder at Everwrite
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Edmar Ferreira
>>>>> Co-Founder at Everwrite
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Edmar Ferreira
>> Co-Founder at Everwrite
>>
>>
>
