nutch-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata
Date Thu, 26 Jan 2006 21:22:19 GMT
Doug Cutting (JIRA) wrote:
> [ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364125 ]

My apologies for commenting here: JIRA produces broken HTML for me, so I 
can't use it...

> Doug Cutting commented on NUTCH-139:
> ------------------------------------
>
> I think we're near agreement here.
>
> Here are the changes I think this patch still needs:
>
> MetadataNames belongs in the protocol package, not util.
>   

Erhm.. please bear with me. I'd rather see these two classes 
(MetadataNames and ContentProperties) in a separate package altogether, 
org.apache.nutch.metadata. The reason is that these two classes will 
most likely be used elsewhere too, not just in the protocol and 
parse/fetch related context. I'm thinking specifically of CrawlData.

> We should rename ContentProperties to Metadata.
>   

+1.

> We should add an add() method to Metadata, and change set() to replace all
> values rather than add a new value.  Protocol code which creates properties
> from headers should then use add().

+1
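
To make the add()/set() distinction concrete, here is a minimal sketch of 
the semantics as I read them from the comment above. It is not the patch 
itself: the class name MetadataSketch, the get()/getValues() accessors and 
the backing HashMap are all assumptions made for illustration, and it is 
written in present-day Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only; not the actual Nutch Metadata class.
class MetadataSketch {
    // Each property name maps to an ordered list of values.
    private final Map<String, List<String>> values = new HashMap<>();

    // add(): append a value, keeping any values already stored under the name.
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // set(): discard all existing values for the name and store only this one.
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Returns the first value for the name, or null if there is none.
    String get(String name) {
        List<String> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    // Returns a read-only view of all values for the name (empty if none).
    List<String> getValues(String name) {
        List<String> list = values.get(name);
        if (list == null) {
            return Collections.<String>emptyList();
        }
        return Collections.unmodifiableList(list);
    }
}
```

With this shape, protocol code that copies repeated headers would call 
add() once per header line so that no value is lost, while code that wants 
a single authoritative value would call set().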

> We could commit after simply moving MetadataNames to protocol, and leave the
> changes to ContentProperties for another commit, but I'd prefer it all be
> done together.

Either way is fine with me. Perhaps splitting this into two commits 
would make it easier to fix potential breakage...

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


