nutch-dev mailing list archives

From "Jérôme Charron" <jerome.char...@gmail.com>
Subject Re: implement thai language indexing and search
Date Tue, 28 Nov 2006 21:56:56 GMT
> i used an existing ThaiAnalyzer which was in lucene package.
> ok - i renamed the lucene.analysis.th.* to nutch.analysis.th.* - compiled
> and
> placed all class files in a jar - analysis-th.jar (do i need to bundle the
> ngp file in the jar as well ?)

1. You don't have to refactor the Lucene analyzer. Just wrap it, as I do
with the French and German analyzers (they both use analyzers from Lucene).
2. The analyzer doesn't need NGP files... I think you misunderstood something:
2.1 On one side, there is the language identifier, which uses NGP files to
identify the language of a document.
2.2 On the other side, if a suitable analyzer is found for the identified
language, it is used to analyze the document.
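
To make the two points concrete, here is a rough, self-contained sketch of the idea. It deliberately does not use the real Lucene or Nutch classes: every name in it (Analyzer, LibraryThaiAnalyzer, ThaiAnalyzerWrapper, identifyLanguage, analyzerFor) is a hypothetical stand-in, and the language check is a simple placeholder, not the NGP-based identifier. It only shows that the wrapper delegates to an unmodified analyzer, and that identifying the language and choosing an analyzer are separate steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class AnalyzerDispatchSketch {

    /** Hypothetical stand-in for an analyzer: turns text into tokens. */
    public interface Analyzer {
        List<String> tokens(String text);
    }

    /** Placeholder for an existing library analyzer that is used unmodified. */
    public static class LibraryThaiAnalyzer implements Analyzer {
        @Override
        public List<String> tokens(String text) {
            // Whitespace splitting only; real Thai segmentation is far more involved.
            List<String> out = new ArrayList<>();
            for (String t : text.trim().split("\\s+")) {
                if (!t.isEmpty()) {
                    out.add(t.toLowerCase(Locale.ROOT));
                }
            }
            return out;
        }
    }

    /** The wrapper: no copied or renamed code, just delegation. */
    public static class ThaiAnalyzerWrapper implements Analyzer {
        private final Analyzer delegate = new LibraryThaiAnalyzer();

        @Override
        public List<String> tokens(String text) {
            return delegate.tokens(text);
        }
    }

    /** Fallback used when no analyzer is registered for a language. */
    public static final Analyzer DEFAULT_ANALYZER =
            text -> new ArrayList<>(Arrays.asList(text.split("\\s+")));

    /** Hypothetical registry: language code -> analyzer. */
    private static final Map<String, Analyzer> ANALYZERS = new HashMap<>();
    static {
        ANALYZERS.put("th", new ThaiAnalyzerWrapper());
    }

    /**
     * Step 2.1 (placeholder): the real identifier relies on NGP files.
     * Here we only check for characters in the Thai Unicode block.
     */
    public static String identifyLanguage(String text) {
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '\u0E00' && c <= '\u0E7F') {
                return "th";
            }
        }
        return "unknown";
    }

    /** Step 2.2: pick the analyzer for the identified language, if any. */
    public static Analyzer analyzerFor(String lang) {
        return ANALYZERS.getOrDefault(lang, DEFAULT_ANALYZER);
    }

    /** Both steps chained: identify first, then analyze. */
    public static List<String> analyze(String text) {
        return analyzerFor(identifyLanguage(text)).tokens(text);
    }
}
```

Note that nothing NGP-related appears inside the analyzer classes; only the identification step would need those files.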

Regards

Jérôme


-- 
http://motrech.free.fr/
http://www.frutch.org/
