nutch-dev mailing list archives

From Andrzej Bialecki
Subject Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata
Date Thu, 26 Jan 2006 21:22:19 GMT
Doug Cutting (JIRA) wrote:

My apologies for commenting here - JIRA produces broken HTML for me, I 
can't use it...

> Doug Cutting commented on NUTCH-139:
> ------------------------------------
> I think we're near agreement here.
> Here are the changes I think this patch still needs:
> MetadataNames belongs in the protocol package, not util.

Erhm.. please bear with me. I'd rather see these two classes in a 
separate package altogether, org.apache.nutch.metadata. The reason is 
that most likely these two classes will be used elsewhere too, not just 
in the protocol and parse/fetch related context. I'm specifically 
referring to the CrawlData.

> We should rename ContentProperties to Metadata.


> We should add an add() method to Metadata, and change set() to replace all values rather
> than add a new value.  Protocol code which creates properties from headers should then use
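
For illustration, the add()/set() split proposed in the quote above could look roughly like the following. This is only a minimal sketch, not the patch itself: the class name MetadataSketch, the get() and getValues() accessors, and the map-of-lists storage are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical multi-valued property container; not the actual Nutch class. */
class MetadataSketch {
    // Each name maps to an ordered list of values (assumed storage layout).
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Appends a value, keeping any values already stored under the name. */
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Replaces all values stored under the name with this single value. */
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    /** Returns the first value stored under the name, or null if there is none. */
    String get(String name) {
        List<String> stored = values.get(name);
        return (stored == null || stored.isEmpty()) ? null : stored.get(0);
    }

    /** Returns a copy of all values stored under the name (empty if none). */
    List<String> getValues(String name) {
        List<String> stored = values.get(name);
        return stored == null ? new ArrayList<>() : new ArrayList<>(stored);
    }
}
```

With this shape, calling add() twice under the same name leaves two values, while a later set() under that name leaves exactly one.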


> We could commit after simply moving MetadataNames to protocol, and leave the changes
> to ContentProperties for another commit, but I'd prefer it all be done together.

Either way is fine with me. Perhaps splitting this into two commits 
would make it easier to fix potential breakage...

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
