nutch-dev mailing list archives

From Bartosz Gadzimski <bartek...@o2.pl>
Subject Re: login failed exception
Date Tue, 14 Apr 2009 08:09:22 GMT
Hello Frank,

Yes, it's a memory issue - you need to increase the Java heap size.

Just follow these instructions (another thing to add to the wiki ;)

Eclipse -> Window -> Preferences -> Java -> Installed JREs -> edit -> 
Default VM arguments

I've set mine to -Xms5m -Xmx150m because I have only about 200 MB of RAM
left after running all my apps.

-Xms (initial/minimum Java heap size)
-Xmx (maximum Java heap size)
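
If you ever run the crawler outside Eclipse, the same flags go on the java
command line. Just a sketch - the classpath is a placeholder you'd fill in
yourself, and the crawl arguments simply mirror the log in Frank's mail below:

   java -Xms5m -Xmx150m -cp <nutch classpath> \
        org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3 -topN 5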

It should help.

Thanks,
Bartosz

Frank McCown pisze:
> Hello Bartosz,
>
> I'm running the default Nutch 1.0 version on Windows XP (2 GB RAM)
> with Eclipse 3.3.0.  I followed the directions at
>
> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>
> exactly as stated.  I'm able to run the default Nutch 0.9 release
> without any problems in Eclipse.  But when I run 1.0, I always get the
> java.io.IOException as stated in my last email.  I had assumed it was
> due to the plugin issue, but maybe not.  I'm just running a very small
> crawl with two seed URLs.
>
> Here's what hadoop.log says:
>
> 2009-04-13 13:41:03,010 INFO  crawl.Crawl - crawl started in: crawl
> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - rootUrlDir = urls
> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - threads = 10
> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - depth = 3
> 2009-04-13 13:41:03,025 INFO  crawl.Crawl - topN = 5
> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: starting
> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: urlDir: urls
> 2009-04-13 13:41:03,479 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
> 2009-04-13 13:41:03,588 WARN  mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2009-04-13 13:41:06,105 WARN  mapred.LocalJobRunner - job_local_0001
> java.lang.OutOfMemoryError: Java heap space
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:498)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
>
>
> I have not tried Sanjoy's advice yet... it looks like this is a memory issue.
>
> Any advice would be much appreciated,
> Frank
>
>
> 2009/4/10 Bartosz Gadzimski <bartek--g@o2.pl>:
>   
>> Hello Frank,
>>
>> Please look into hadoop.log - maybe there is something more in there.
>>
>> About your error - you need to give us more specifics about your Nutch
>> configuration.
>>
>> The default Nutch installation works with no problems (I've never changed
>> the src/plugin path).
>>
>> Please tell us: your Nutch version, any changes you've made, and any
>> configuration differences (other than adding your domain to
>> crawl-urlfilter).
>>
>> Thanks,
>> Bartosz
>>
>> Frank McCown pisze:
>>     
>>> Adding cygwin to my PATH solved my problem with whoami.  But now I'm
>>> getting an exception when running the crawler:
>>>
>>> Injector: Converting injected urls to crawl db entries.
>>> Exception in thread "main" java.io.IOException: Job failed!
>>>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>>>        at org.apache.nutch.crawl.Injector.inject(Injector.java:160)
>>>        at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
>>>
>>> I know from searching the mailing list that this is normally due to a
>>> bad plugin.folders setting in the nutch-default.xml, but I used the
>>> same value as the tutorial (./src/plugin) to no avail.
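>>>
>>> For reference, here is the relevant property as I have it in
>>> nutch-default.xml (just a sketch of the entry in question - the value
>>> is the tutorial's):
>>>
>>>   <property>
>>>     <name>plugin.folders</name>
>>>     <value>./src/plugin</value>
>>>   </property>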
>>>
>>> (As an aside, seems like Hadoop should provide a better error message
>>> if the plugin folder doesn't exist.)
>>>
>>> Anyway, thanks, Bartosz, for your help.
>>>
>>> Frank
>>>
>>>
>>> 2009/4/10 Bartosz Gadzimski <bartek--g@o2.pl>:
>>>
>>>> Hello,
>>>>
>>>> So now you have to install cygwin and make sure you add it to your PATH.
>>>>
>>>> It's described in http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>
>>>> After this you should be able to run "bash" command from command prompt
>>>> (Menu Start > RUN > cmd.exe)
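>>>>
>>>> For example (assuming the default C:\cygwin install location - adjust
>>>> if yours differs), from cmd.exe:
>>>>
>>>>   set PATH=%PATH%;C:\cygwin\bin
>>>>   bash --version
>>>>   whoami
>>>>
>>>> Both commands should print something instead of "not recognized".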
>>>>
>>>> Then you're done - everything will work.
>>>>
>>>> I must add this to the wiki - I forgot about the whoami problem.
>>>>
>>>> Take care,
>>>> Bartosz
>>>>
>>>> sanjoy.ghosh@thomsonreuters.com pisze:
>>>>
>>>>> Thanks for the suggestion, Bartosz.  I downloaded whoami, and it promptly
>>>>> crashed on "bash".
>>>>>
>>>>> 09/04/10 12:02:28 WARN fs.FileSystem: uri=file:///
>>>>> javax.security.auth.login.LoginException: Login failed: Cannot run program "bash": CreateProcess error=2, The system cannot find the file specified
>>>>>       at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:250)
>>>>>       at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:275)
>>>>>       at org.apache.hadoop.security.UnixUserGroupInformation.login(UnixUserGroupInformation.java:257)
>>>>>       at org.apache.hadoop.security.UserGroupInformation.login(UserGroupInformation.java:67)
>>>>>       at org.apache.hadoop.fs.FileSystem$Cache$Key.<init>(FileSystem.java:1438)
>>>>>       at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1376)
>>>>>       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>>>>>       at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>>>>>       at org.apache.nutch.crawl.Crawl.main(Crawl.java:84)
>>>>>
>>>>> Where am I going to find "bash" on Windows without running command-line
>>>>> cygwin?  Is there a way to turn off this security check in Hadoop?
>>>>>
>>>>> Thanks,
>>>>> Sanjoy
>>>>>
>>>>> -----Original Message-----
>>>>> From: Bartosz Gadzimski [mailto:bartek--g@o2.pl]
>>>>> Sent: Friday, April 10, 2009 5:06 AM
>>>>> To: nutch-dev@lucene.apache.org
>>>>> Subject: Re: login failed exception
>>>>>
>>>>> Hello,
>>>>>
>>>>> I'm not sure if this is the cause, but you should try adding whoami to
>>>>> your Windows box.
>>>>>
>>>>> For example, for Windows XP SP2:
>>>>> http://www.microsoft.com/downloads/details.aspx?FamilyId=49AE8576-9BB9-4126-9761-BA8011FABF38&displaylang=en
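>>>>>
>>>>> Once it's installed and on your PATH, a quick sanity check from
>>>>> cmd.exe should just print your user name:
>>>>>
>>>>>   C:\> whoami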
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Bartosz
>>>>>
>>>>> Frank McCown pisze:
>>>>>
>>>>>> I've been running 0.9 in Eclipse on Windows for some time, and I was
>>>>>> successful in running the NutchBean from version 1.0 in Eclipse, but
>>>>>> the crawler gave me the same exception as it gave this individual.
>>>>>> Maybe there's something else I'm overlooking, but I followed the
>>>>>> Tutorial at
>>>>>>
>>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>>
>>>>>> to a T.  I'll keep working on it though.
>>>>>>
>>>>>> Frank
>>>>>>
>>>>>>
>>>>>> 2009/4/10 Bartosz Gadzimski <bartek--g@o2.pl>:
>>>>>>
>>>>>>> fmccown pisze:
>>>>>>>
>>>>>>>> You must run Nutch's crawler using cygwin on Windows since cygwin
>>>>>>>> has the whoami program.  If you run it from Eclipse on Windows, it
>>>>>>>> can't use cygwin's whoami program and will fail with the exceptions
>>>>>>>> you saw.  This is an unfortunate design decision in Hadoop which
>>>>>>>> makes anything after version 0.9 not work in Eclipse on Windows.
>>>>>>>
>>>>>>> That's not true - please look at
>>>>>>> http://wiki.apache.org/nutch/RunNutchInEclipse0.9
>>>>>>>
>>>>>>> I am using Nutch 1.0 with Eclipse on Windows with no problems.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Bartosz
>>>>>>>

