nutch-dev mailing list archives

From Jack Tang <him...@gmail.com>
Subject Nutch Query
Date Wed, 15 Jun 2005 10:27:03 GMT
Hi All

I have customized some query filters over the past two weeks, and I
have one question. As I mentioned in my previous email, the target
website is made up of two parts: text-only and graphic. My goal is to
tag the index with "textonly" and "graphic". Here are two approaches
to reach that goal. Both query filters implement
FieldQueryFilter.

1. Tagging the content (parse.getText()) with the name ("textonly" or
"graphic"), so the query string should look like:
   textonly:queryString 
or 
   graphic:queryString

2. Adding another field whose name is "version", and the available
values are "textonly" and "graphic". So the query string looks like:
   version:textonly queryString 
or
   version:graphic queryString


As I understand it, for the same queryString both approaches should
return the same results. Right? But in my test, the latter query filter
returns all textonly/graphic pages and ignores the queryString. The
first one seems OK.

So, can someone explain this?

BTW:
In Query.class
Query: version:graphic file
Parsed: version:graphic file
Translated: +version:graphic +(url:file^4.0 anchor:file^2.0 content:file)
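
A minimal sketch of how the translated query above would be expected to behave, assuming the usual Lucene reading in which `+` marks a required clause and the terms inside the parentheses are alternatives. The `Doc` class, `containsTerm`, and `matches` are hypothetical stand-ins for illustration and are not part of Nutch or Lucene. The boosts (^4.0, ^2.0) only affect scoring, so they are left out.

```java
import java.util.Locale;

public class TranslatedQuerySketch {

    // Hypothetical stand-in for an indexed page; not a Nutch class.
    static final class Doc {
        final String version;
        final String url;
        final String anchor;
        final String content;

        Doc(String version, String url, String anchor, String content) {
            this.version = version;
            this.url = url;
            this.anchor = anchor;
            this.content = content;
        }
    }

    // Crude tokenizer: splits on non-word characters and compares whole tokens.
    static boolean containsTerm(String field, String term) {
        for (String token : field.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.equals(term)) {
                return true;
            }
        }
        return false;
    }

    // Models: +version:<version> +(url:<term> anchor:<term> content:<term>)
    // Both required clauses must hold for a document to match.
    static boolean matches(Doc d, String version, String term) {
        boolean versionClause = d.version.equals(version);
        boolean termClause = containsTerm(d.url, term)
                || containsTerm(d.anchor, term)
                || containsTerm(d.content, term);
        return versionClause && termClause;
    }

    public static void main(String[] args) {
        Doc a = new Doc("graphic", "http://example.com/file.html", "", "download page");
        Doc b = new Doc("graphic", "http://example.com/index.html", "home", "welcome");
        Doc c = new Doc("textonly", "http://example.com/t/file.html", "", "a file");

        System.out.println(matches(a, "graphic", "file")); // true: both clauses hold
        System.out.println(matches(b, "graphic", "file")); // false: version matches, term does not
        System.out.println(matches(c, "graphic", "file")); // false: term matches, version does not
    }
}
```

Under this reading, a page tagged "graphic" that does not contain "file" in its url, anchor, or content should not be returned, which is why the behaviour described above looks unexpected.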

Regards
/Jack
