nutch-dev mailing list archives

From Jack Tang <>
Subject Nutch Query
Date Wed, 15 Jun 2005 10:27:03 GMT
Hi All

I have customized some query filters over the past two weeks,
and I have one question. As I mentioned in my previous email, the target
website is made up of two parts: text-only and graphic. My goal is to
tag the index with "textonly" and "graphic". Here are the two
approaches I tried to reach that goal. Both query filters implement

1. Tagging the content (parse.getText()) with the name ("textonly" or
"graphic"), so the query string should look like:

2. Adding another field whose name is "version", and the available
values are "textonly" and "graphic". So the query string looks like:
   version:textonly queryString 
   version:graphic queryString

As I see it, if queryString is the same, the search results should be
the same. Right? But in my test, the latter query filter shows all
textonly/graphic pages and ignores the queryString. The first one seems

So, can someone explain why this happens?

In Query.class
Query: version:graphic file
Parsed: version:graphic file
Translated: +version:graphic +(url:file^4.0 anchor:file^2.0 content:file)
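
As a rough illustration of what the translated query above should mean under Lucene's boolean syntax, where a leading + marks a required clause, here is a minimal sketch in plain Java. It does not use the Nutch or Lucene APIs: the class TranslatedQuerySketch, its methods, and the sample pages are all hypothetical, the tokenizer is a crude stand-in for a real analyzer, and the boosts (^4.0, ^2.0) are ignored because they affect ranking rather than matching.

```java
import java.util.HashMap;
import java.util.Map;

public class TranslatedQuerySketch {

    // Hypothetical stand-in for an indexed page: field name -> field text.
    public static Map<String, String> page(String version, String url,
                                           String anchor, String content) {
        Map<String, String> doc = new HashMap<>();
        doc.put("version", version);
        doc.put("url", url);
        doc.put("anchor", anchor);
        doc.put("content", content);
        return doc;
    }

    // True if the field contains the term as a whole lowercase token.
    // Splitting on non-alphanumerics is an assumption, not Nutch's analyzer.
    public static boolean hasTerm(Map<String, String> doc, String field, String term) {
        String text = doc.getOrDefault(field, "").toLowerCase();
        for (String token : text.split("[^a-z0-9]+")) {
            if (token.equals(term)) {
                return true;
            }
        }
        return false;
    }

    // Models: +version:<version> +(url:<term> anchor:<term> content:<term>)
    // Both "+" clauses are required; the inner group needs at least one hit.
    public static boolean matches(Map<String, String> doc, String version, String term) {
        boolean versionClause = hasTerm(doc, "version", version);
        boolean termClause = hasTerm(doc, "url", term)
                || hasTerm(doc, "anchor", term)
                || hasTerm(doc, "content", term);
        return versionClause && termClause;
    }

    public static void main(String[] args) {
        Map<String, String> a = page("graphic", "http://example.com/file.html", "", "download page");
        Map<String, String> b = page("graphic", "http://example.com/index.html", "home", "welcome");
        Map<String, String> c = page("textonly", "http://example.com/t/file.html", "", "file list");

        System.out.println(matches(a, "graphic", "file")); // true: both clauses hold
        System.out.println(matches(b, "graphic", "file")); // false: "file" is missing
        System.out.println(matches(c, "graphic", "file")); // false: wrong version
    }
}
```

Under this reading, a page like b should not be returned for "version:graphic file". If the second approach does return such pages, that would suggest the required queryString clause is not reaching the final Lucene query, even though the translated string printed above includes it.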

