nutch-dev mailing list archives

From ".: Abhishek :." <ab1s...@gmail.com>
Subject Implementing a negative keyword filter in index
Date Tue, 01 Feb 2011 04:40:55 GMT
Hi all,

 I posted this question to user@nutch.apache.org; sorry for the duplicate,
but I thought it would be good to post here as well.

 I am planning to implement a negative keyword indexer, such that if a
negative keyword appears in a segment, that segment never shows up in
search results. I have the following steps in mind; please let me know if
they are right.

   - Write a plug-in
      - Extend the IndexingFilter.
      - Do a NutchDocument.removeField for the negative keyword.
      - Return the doc.
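
 A rough sketch of the steps above, to make the question concrete. It does
not use the real Nutch classes: Doc is a hypothetical stand-in for
NutchDocument, filter() is a hypothetical stand-in for the filter method of
an IndexingFilter plug-in, and the field names "content" and "url" are
assumed for illustration only. Matching is a naive case-insensitive
substring check.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class NegativeKeywordSketch {

    /** Hypothetical stand-in for NutchDocument: a map of field name to value. */
    static class Doc {
        final Map<String, String> fields = new HashMap<>();

        void add(String name, String value) {
            fields.put(name, value);
        }

        String removeField(String name) {
            return fields.remove(name);
        }
    }

    /**
     * Hypothetical stand-in for the filter step of an IndexingFilter plug-in.
     * If the (assumed) "content" field contains any negative keyword, that
     * field is removed, and the document itself is still returned.
     */
    static Doc filter(Doc doc, Set<String> negativeKeywords) {
        String content = doc.fields.getOrDefault("content", "")
                .toLowerCase(Locale.ROOT);
        for (String keyword : negativeKeywords) {
            if (content.contains(keyword.toLowerCase(Locale.ROOT))) {
                doc.removeField("content");
                break;
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Set<String> negative = new HashSet<>(Arrays.asList("forbidden"));

        Doc doc = new Doc();
        doc.add("url", "http://example.com/page.html");
        doc.add("content", "Some Forbidden text on the page");

        Doc result = filter(doc, negative);

        // The field is gone, but the document (the page) is still there.
        System.out.println(result.fields.containsKey("content"));
        System.out.println(result.fields.containsKey("url"));
    }
}
```

 In this sketch the document still comes back with its other fields after
the removal, which is what prompts the question below.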

  Now my question is:

   - The NutchDocument is always mapped to an HTML page, so if I do the
   above, am I really preventing just the segment from being indexed, or am
   I preventing the whole page from being indexed?

 Also, please let me know whether what I intend to do is the right
approach. Thanks again, all, for your time.

Cheers,
Abhi
