Piotr Kosiorowski wrote:
>Hi,
>Some time ago I started thinking about implementing a special kind of Lucene
>Query optimized for Nutch (if I remember correctly, I would have to write my
>own Scorer and probably a few other classes). I assumed that with a
>specialized query I could avoid accessing some of the Lucene index
>structures multiple times, since the same term appears many times in the
>query that Nutch generates for multi-token queries. I am not a Lucene expert,
>but maybe it is worth checking whether it would give a performance boost.
>Does anyone have ideas on why it might or might not help?
>
>
That's a very good comment. Looking at the profile traces I can see that
a lot of time is spent just juggling the sub-query scorers inside the
BooleanScorer, and handling the complex query structure; if this part
could be optimized by the use of a special scorer, it could be a big win.
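
To make the idea concrete, here is a minimal sketch in plain Java. It does not
use the Lucene API: FakeIndex, PostingsCache and score() are hypothetical names
made up for illustration, and the scoring is a simple count rather than
Lucene's formula. It only shows how clauses that repeat the same field:term
could share a single postings read instead of each fetching it again.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedPostingsSketch {
    /** Hypothetical stand-in for an index reader; counts how often postings are fetched. */
    static class FakeIndex {
        final Map<String, int[]> postings = new HashMap<>();
        int reads = 0;

        int[] readPostings(String fieldAndTerm) {
            reads++;
            return postings.getOrDefault(fieldAndTerm, new int[0]);
        }
    }

    /** Memoizes postings per field:term so repeated clauses share one read. */
    static class PostingsCache {
        private final FakeIndex index;
        private final Map<String, int[]> cache = new HashMap<>();

        PostingsCache(FakeIndex index) {
            this.index = index;
        }

        int[] get(String fieldAndTerm) {
            // Only the first request for a given key reaches the index.
            return cache.computeIfAbsent(fieldAndTerm, index::readPostings);
        }
    }

    /** Adds 1.0 to a document's score for every clause that matches it. */
    static Map<Integer, Float> score(List<String> clauses, PostingsCache cache) {
        Map<Integer, Float> scores = new HashMap<>();
        for (String clause : clauses) {
            for (int doc : cache.get(clause)) {
                scores.merge(doc, 1.0f, Float::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        FakeIndex index = new FakeIndex();
        index.postings.put("content:lucene", new int[] {1, 2});
        index.postings.put("content:nutch", new int[] {2});
        // The same terms recur in several clauses, as in an expanded multi-token query.
        List<String> clauses = List.of(
                "content:lucene", "content:nutch", "content:lucene", "content:nutch");
        Map<Integer, Float> scores = score(clauses, new PostingsCache(index));
        System.out.println("index reads: " + index.reads + ", scores: " + scores);
    }
}
```

Whether this kind of sharing pays off in a real scorer would still have to be
confirmed by profiling; the sketch ignores positions, boosts and the
per-clause bookkeeping that BooleanScorer does.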
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com