nutch-dev mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject Re: Running Issue about Nutch 1.3
Date Thu, 03 Nov 2011 19:13:55 GMT
Hi

Please use the user@nutch mailing list for user-related questions. This list is
for development of Nutch itself.

Cheers

> Hello,
> 
> I get the following output in "hadoop.log" after configuring Nutch 1.3 in
> Eclipse (Win 7), but I don't know how to resolve it. Can you help me?
> I'm new to Nutch, so please forgive any wrong terminology.
> 
> 
> 
> 2011-11-03 16:51:53,300 WARN  crawl.Crawl - solrUrl is not set, indexing
> will be skipped...
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - crawl started in: crawl
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - rootUrlDir = urls
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - threads = 4
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - depth = 5
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - solrUrl=null
> 
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - topN = 10
> 
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: starting at
> 2011-11-03 16:51:53
> 
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: crawlDb:
> crawl/crawldb
> 
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: urlDir: urls
> 
> 2011-11-03 16:51:53,534 INFO  crawl.Injector - Injector: Converting
> injected urls to crawl db entries.
> 
> 2011-11-03 16:51:53,658 WARN  mapred.JobClient - No job jar file set.  User
> classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> 
> 2011-11-03 16:51:54,267 INFO  plugin.PluginRepository - Plugins: looking
> in: E:\IdealTimes\WorkSpace\Nutch1.3\plugin
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Plugin
> Auto-activation mode: [true]
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Registered Plugins:
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - the nutch core
> extension points (nutch-extensionpoints)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Basic URL
> Normalizer (urlnormalizer-basic)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Html Parse Plug-in
> (parse-html)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Basic Indexing
> Filter (index-basic)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - HTTP Framework
> (lib-http)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Pass-through URL
> Normalizer (urlnormalizer-pass)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL Filter
> (urlfilter-regex)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Http Protocol
> Plug-in (protocol-http)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL
> Normalizer (urlnormalizer-regex)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Tika Parser Plug-in
> (parse-tika)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - OPIC Scoring
> Plug-in (scoring-opic)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - CyberNeko HTML
> Parser (lib-nekohtml)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Anchor Indexing
> Filter (index-anchor)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL Filter
> Framework (lib-regex-filter)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Registered
> Extension-Points:
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch URL
> Normalizer (org.apache.nutch.net.URLNormalizer)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Segment Merge
> Filter (org.apache.nutch.segment.SegmentMergeFilter)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Indexing
> Filter (org.apache.nutch.indexer.IndexingFilter)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - HTML Parse Filter
> (org.apache.nutch.parse.HtmlParseFilter)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Content
> Parser (org.apache.nutch.parse.Parser)
> 
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 
> 2011-11-03 16:51:54,345 WARN  net.URLNormalizers -
> URLNormalizers:PluginRuntimeException when initializing url normalizer
> plugin urlnormalizer-basic instance in getURLNormalizers function:
> attempting to continue instantiating plugins
> 
> 2011-11-03 16:51:54,360 WARN  net.URLNormalizers -
> URLNormalizers:PluginRuntimeException when initializing url normalizer
> plugin urlnormalizer-regex instance in getURLNormalizers function:
> attempting to continue instantiating plugins
> 
> 2011-11-03 16:51:54,360 WARN  net.URLNormalizers -
> URLNormalizers:PluginRuntimeException when initializing url normalizer
> plugin urlnormalizer-pass instance in getURLNormalizers function:
> attempting to continue instantiating plugins
> 
> 2011-11-03 16:51:54,360 WARN  mapred.LocalJobRunner - job_local_0001
> 
> java.lang.RuntimeException: Error in configuring object
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 
>          at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
> 
>          at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
> 
>          at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> 
> Caused by: java.lang.reflect.InvocationTargetException
> 
>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 
>          at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> 
>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> 
>          at java.lang.reflect.Method.invoke(Unknown Source)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> 
>          ... 5 more
> 
> Caused by: java.lang.RuntimeException: Error in configuring object
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 
>          at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
> 
>          ... 10 more
> 
> Caused by: java.lang.reflect.InvocationTargetException
> 
>          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 
>          at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> 
>          at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> 
>          at java.lang.reflect.Method.invoke(Unknown Source)
> 
>          at
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
> 
>          ... 13 more
> 
> Caused by: java.lang.RuntimeException:
> org.apache.nutch.plugin.PluginRuntimeException:
> java.lang.ClassNotFoundException:
> org.apache.nutch.urlfilter.regex.RegexURLFilter
> 
>          at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:77)
> 
>          at
> org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:72)
> 
>          ... 18 more
> 
> Caused by: org.apache.nutch.plugin.PluginRuntimeException:
> java.lang.ClassNotFoundException:
> org.apache.nutch.urlfilter.regex.RegexURLFilter
> 
>          at
> org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
> 
>          at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:57)
> 
>          ... 19 more
> 
> Caused by: java.lang.ClassNotFoundException:
> org.apache.nutch.urlfilter.regex.RegexURLFilter
> 
>          at java.net.URLClassLoader$1.run(Unknown Source)
> 
>          at java.net.URLClassLoader$1.run(Unknown Source)
> 
>          at java.security.AccessController.doPrivileged(Native Method)
> 
>          at java.net.URLClassLoader.findClass(Unknown Source)
> 
>          at java.lang.ClassLoader.loadClass(Unknown Source)
> 
>          at java.lang.ClassLoader.loadClass(Unknown Source)
> 
>          at
> org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
> 
>          ... 20 more
> 
> 
> 
> 
> 
> Best wishes!
> 
> 
> 
> Skiming_zhang
