nutch-dev mailing list archives

From Sami Siren <ssi...@gmail.com>
Subject Re: svn commit: r485076 - in /lucene/nutch/trunk/src: java/org/apache/nutch/metadata/SpellCheckedMetadata.java test/org/apache/nutch/metadata/TestSpellCheckedMetadata.java
Date Sun, 10 Dec 2006 09:52:37 GMT
Chris Mattmann wrote:
>  Indeed, I see your point. I guess what I was advocating for was more of a
> ProtocolHeaders interface, that lives in org.apache.nutch.metadata. Then, we
> could update the code that you have below to use ProtocolHeaders.class
> rather than HttpHeaders.class. We would then make ProtocolHeaders extend
> HttpHeaders, so that it by default inherits all of the HttpHeaders, while
> still allowing more ProtocolHeader met keys (e.g., we could have an
> interface for FileHeaders, etc.).

Yes, I am OK with adding more words to SCMetadata if required. Do you
have a concrete example of the FileHeaders you are planning to add?

>  What do you think about that? Alternatively we could just create a
> ProtocolHeaders interface in org.apache.nutch.metadata that aggregates all
> the met key fields from HttpHeaders, and it would be the place that the met
> key fields for FileHeaders, etc. could go into.

You don't actually need to construct a hierarchy of interfaces for the
constants, because I changed SCMetadata to initialize itself from an
array of classes.
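
To illustrate the idea, here is a minimal sketch of collecting name
constants from an array of classes via reflection. It is not the
committed SpellCheckedMetadata code: the class ConstantNames, the
nested interfaces HttpNames and FileNames, and the normalization rule
are all assumptions made up for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
class ConstantNames {

  // Hypothetical stand-ins for interfaces that only hold name constants.
  public interface HttpNames {
    String CONTENT_TYPE = "Content-Type";
    String LOCATION = "Location";
  }

  public interface FileNames {
    String LAST_MODIFIED = "Last-Modified";
  }

  // Assumed normalization: keep letters and digits only, lower-cased.
  static String normalize(String name) {
    StringBuilder sb = new StringBuilder();
    for (char c : name.toCharArray()) {
      if (Character.isLetterOrDigit(c)) {
        sb.append(Character.toLowerCase(c));
      }
    }
    return sb.toString();
  }

  // Collects every public static final String field of the given classes
  // into a map from normalized name to canonical name. The classes do not
  // need to extend one another.
  static Map<String, String> collect(Class<?>... sources) {
    Map<String, String> names = new HashMap<String, String>();
    for (Class<?> source : sources) {
      for (Field field : source.getFields()) {
        int mods = field.getModifiers();
        boolean constant = Modifier.isPublic(mods)
            && Modifier.isStatic(mods)
            && Modifier.isFinal(mods)
            && field.getType() == String.class;
        if (!constant) {
          continue;
        }
        try {
          String value = (String) field.get(null);
          names.put(normalize(value), value);
        } catch (IllegalAccessException e) {
          // Skip fields that cannot be read.
        }
      }
    }
    return names;
  }
}
```

With this shape, supporting another set of names means passing one more
class to collect(...), without any interface having to extend another.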

The optimization I made is not that significant in the big picture, so
if there are real objections to it, it can also be reverted.

However, my original opinion hasn't really changed: we should probably
move the spell-checking feature to a static utility method so it can be
used when needed (probably also with a customizable, context-optimizable
dictionary). That way it could also be used outside the metadata context.
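
A rough sketch of what such a static utility could look like, assuming
a caller-supplied dictionary and a plain Levenshtein edit distance as
the matching rule. The class NameSpellChecker, the method correct(...)
and the maxDistance parameter are hypothetical names for this example,
not existing Nutch API.

```java
import java.util.Collection;
import java.util.Locale;

// Hypothetical utility, for illustration only.
final class NameSpellChecker {

  private NameSpellChecker() {
    // Static methods only.
  }

  // Levenshtein edit distance between a and b, using two rolling rows.
  static int distance(String a, String b) {
    int[] prev = new int[b.length() + 1];
    int[] curr = new int[b.length() + 1];
    for (int j = 0; j <= b.length(); j++) {
      prev[j] = j;
    }
    for (int i = 1; i <= a.length(); i++) {
      curr[0] = i;
      for (int j = 1; j <= b.length(); j++) {
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
            prev[j - 1] + cost);
      }
      int[] tmp = prev;
      prev = curr;
      curr = tmp;
    }
    return prev[b.length()];
  }

  // Returns the dictionary entry closest to name (case-insensitive) if it
  // is within maxDistance edits; otherwise returns name unchanged.
  static String correct(String name, Collection<String> dictionary,
      int maxDistance) {
    String best = name;
    int bestDistance = maxDistance + 1;
    String lowered = name.toLowerCase(Locale.ROOT);
    for (String candidate : dictionary) {
      int d = distance(lowered, candidate.toLowerCase(Locale.ROOT));
      if (d < bestDistance) {
        bestDistance = d;
        best = candidate;
      }
    }
    return best;
  }
}
```

Because the dictionary is just a parameter, each caller can pass a
small list suited to its own context, and nothing here depends on the
metadata classes.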

--
 Sami Siren
