nutch-dev mailing list archives

From "Julien Nioche (JIRA)" <j...@apache.org>
Subject [jira] Commented: (NUTCH-826) Mailing list is broken.
Date Mon, 24 May 2010 08:32:24 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870528#action_12870528 ]

Julien Nioche commented on NUTCH-826:
-------------------------------------

Nutch has recently become a TLP and some of the info on the website needs updating.

To subscribe to the list, send a message to:
  <user-subscribe@nutch.apache.org>

To remove your address from the list, send a message to:
  <user-unsubscribe@nutch.apache.org>

Send mail to the following for info and FAQ for this list:
  <user-info@nutch.apache.org>
  <user-faq@nutch.apache.org>

PS: this is hardly a blocker.

> Mailing list is broken.
> -----------------------
>
>                 Key: NUTCH-826
>                 URL: https://issues.apache.org/jira/browse/NUTCH-826
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: John Sherwood
>            Priority: Blocker
>
> All of the following addresses are failing:
> nutch-user@nutch.apache.org
> nutch-user-subscribe@nutch.apache.org
> nutch-user-subscribe@lucene.apache.org
> For the last one, the mailer daemon said 
> "This mailing list has moved to user at nutch.apache.org."
> Below is the message I tried to send:
> Hi people,
> I've been banging my head against this problem for two days now.
> Simply, I want to add a field with the value of a given meta tag.
> I've been trying the parse-xml plugin, but it seems that it doesn't
> work with version 1.0.  I've tried the code at
> http://sujitpal.blogspot.com/2009/07/nutch-getting-my-feet-wet.html
> and it hasn't worked.  I don't even know why.  I don't even know if my
> plugin is being used... or even looked for!  Nutch seems to have an
> infuriating "Fail silently" policy for plugins.  I put a
> System.exit(1) in my filters just to see if my code is even being
> encountered.  It has not in spite of my config telling it to.
> Here's my config:
> nutch-site.xml
> ...
> <property>
>  <name>plugin.includes</name>
>  <value>protocol-http|urlfilter-regex|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|metadata</value>
> </property>
> ...
> parse-plugins.xml
> ...
> <mimeType name="application/xhtml+xml">
>    <plugin id="parse-html" />
>    <plugin id="metadata" />
> </mimeType>
> <mimeType name="text/html">
>       <plugin id="parse-html" />
>       <plugin id="metadata" />
> </mimeType>
> <mimeType name="text/sgml">
>       <plugin id="parse-html" />
>       <plugin id="metadata" />
> </mimeType>
> <mimeType name="text/xml">
>          <plugin id="parse-html" />
>          <plugin id="parse-rss" />
>         <plugin id="metadata" />
>         <plugin id="feed" />
> </mimeType>
> ...
> <alias name="metadata"
> extension-id="com.example.website.nutch.parsing.MetaTagExtractorParseFilter"
> />
> ...
> I've also copied the plugin.xml and jar from my build/metadata to the
> plugins root dir.
> Nonetheless, Nutch runs and puts data in solr for me.  Afaik, Nutch is
> completely unaware of my plugin despite my config options.  Is there
> some other place I need to tell Nutch to use my plugin?  Is there some
> other approach to do this without having to write a plugin?  This does
> seem like a lot of work to simply get a meta tag into a field.  Any
> help would be appreciated.
> Sincerely,
> John Sherwood
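
Editorial sketch, not part of the original message: the plugin.includes value quoted above is a regular expression, so one quick sanity check is to test which plugin ids it accepts. The snippet below assumes that Nutch includes a plugin only when its whole id matches the expression, and that the custom plugin's id in its plugin.xml is "metadata"; neither assumption is confirmed in this thread. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {
    // Value of plugin.includes, copied from the nutch-site.xml quoted above.
    static final String PLUGIN_INCLUDES =
        "protocol-http|urlfilter-regex|parse-html|index-(basic|anchor)"
        + "|query-(basic|site|url)|response-(json|xml)|summary-basic"
        + "|scoring-opic|urlnormalizer-(pass|regex|basic)|metadata";

    // Assumption: a plugin is included only if its entire id matches the expression.
    static boolean isIncluded(String pluginId) {
        return Pattern.compile(PLUGIN_INCLUDES).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Plugin ids that appear in the quoted parse-plugins.xml.
        String[] ids = {"parse-html", "metadata", "parse-rss", "feed"};
        for (String id : ids) {
            System.out.println(id + " included: " + isIncluded(id));
        }
    }
}
```

Under these assumptions the expression accepts "metadata" and "parse-html" but not "parse-rss" or "feed", although the quoted parse-plugins.xml references those two as well. This suggests the include pattern itself is not what hides the custom plugin, though it does not rule out an id mismatch between plugin.xml and the configuration.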

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

