nutch-dev mailing list archives

From Dennis Kubes <ku...@apache.org>
Subject Re: Plugin Help
Date Sat, 14 Nov 2009 19:53:56 GMT
It depends on how you are building and on your classpath.  Let's call your 
plugin myhtmlfilter.  If you are running on a single server and you added it 
to the deploy section of your src/plugin/build.xml, a myhtmlfilter folder 
containing the plugin should show up under the build/plugins folder after 
the build.  Then you just have to copy that myhtmlfilter folder into your 
deployment's plugins directory.
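
The copy step for the single-server case could be scripted roughly like this. This is only a sketch: the function name deploy_plugin and both directory arguments are hypothetical, and it assumes the built plugin is an ordinary folder that can be copied as-is.

```python
import shutil
from pathlib import Path


def deploy_plugin(build_plugins: Path, deploy_plugins: Path, name: str) -> Path:
    """Copy build_plugins/name to deploy_plugins/name, replacing any old copy.

    All three arguments are assumptions about your layout, not Nutch defaults.
    """
    src = build_plugins / name
    if not src.is_dir():
        # A missing folder usually means the plugin was not built.
        raise FileNotFoundError(f"{src} not found; was the plugin built?")
    dest = deploy_plugins / name
    if dest.exists():
        shutil.rmtree(dest)  # remove the stale copy so old jars do not linger
    shutil.copytree(src, dest)
    return dest
```

For example, deploy_plugin(Path("build/plugins"), Path("/path/to/deployment/plugins"), "myhtmlfilter"), where the second path is a placeholder for wherever your deployment keeps its plugins.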

If you are running on a cluster, even in pseudo-distributed mode, you need 
to copy over the nutch-*.job file.  It has the plugins inside of it, and 
it gets distributed out to the cluster.  If you are referencing the plugin 
from a webapp or the nutch war file, you need to copy it to 
WEB-INF/classes/plugins.
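
One way to confirm that the plugin actually made it into the job file is to list the archive's entries. The sketch below assumes the .job file is a zip-format archive (like a jar) and that the plugin's entries carry the plugin folder name as a path segment. Both are assumptions to verify against your own build, and the function name plugin_entries is hypothetical.

```python
import zipfile


def plugin_entries(archive_path: str, plugin_name: str) -> list[str]:
    """Return the archive entries that have plugin_name as a path segment.

    Assumes archive_path points at a zip-format archive such as a jar.
    """
    with zipfile.ZipFile(archive_path) as archive:
        return [
            entry
            for entry in archive.namelist()
            if plugin_name in entry.split("/")
        ]
```

Call it with the path to your nutch-*.job file and "myhtmlfilter". An empty list would suggest the plugin was not packaged and the job file needs to be rebuilt.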

Dennis

david.stuart@progressivealliance.co.uk wrote:
>   Hi,
> 
> I am trying to write a plugin for nutch and am having real trouble 
> getting it registered in the system. I have created it in src/plugin and 
> added it to both the build.xml in plugin and to nutch-site.xml. It now 
> seems to build ok, but when I try to run a basic crawl urls -dir crawl 
> -depth 3 -topN 2 I see the plugin registered in the hadoop.log
> 
> 2009-11-14 14:57:45,739 INFO  plugin.PluginRepository -  Html Filter 
> Parse Plug-in (parse-htmlfilter)
> 
> But then I get the error message below. I have followed all of the 
> tutorials, but they are mostly for nutch 0.9 and have errors in them, which 
> I have worked through.
> 
> Thanks for your help
> 
> regards,
> Dave
> java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
>         at org.apache.nutch.indexer.IndexingFilters.<init>(IndexingFilters.java:100)
>         at org.apache.nutch.indexer.IndexerMapReduce.configure(IndexerMapReduce.java:61)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
>         at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:58)
>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:83)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:338)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
> Caused by: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
>         at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
>         at org.apache.nutch.indexer.IndexingFilters.<init>(IndexingFilters.java:70)
>         ... 8 more
> Caused by: java.lang.ClassNotFoundException: org.apache.nutch.parse.htmlfilter.HtmlfilterIndexer
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
>         at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
