lucene-dev mailing list archives

From "Robert Muir (JIRA)" <>
Subject [jira] Updated: (LUCENE-1728) Move SmartChineseAnalyzer & resources to own contrib project
Date Thu, 16 Jul 2009 18:50:14 GMT


Robert Muir updated LUCENE-1728:

    Attachment: LUCENE-1728.txt

Simon, I revised the patch. Here are the new instructions for analyzers/common and analyzers/smartcn.
Sorry for the delay.

## 1. clean svn checkout
## 2. run the following commands to refactor the files.

mkdir contrib/analyzers/common
mkdir -p contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn contrib/analyzers/smartcn/src/test/org/apache/lucene/analysis/cn
svn add contrib/analyzers/smartcn contrib/analyzers/common
svn move contrib/analyzers/src/java/org/apache/lucene/analysis/cn/
svn move contrib/analyzers/src/java/org/apache/lucene/analysis/cn/smart/hhmm/* contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn
svn move contrib/analyzers/src/java/org/apache/lucene/analysis/cn/smart/*.java contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn
svn delete contrib/analyzers/src/java/org/apache/lucene/analysis/cn/smart
svn move contrib/analyzers/src/test/org/apache/lucene/analysis/cn/
svn move contrib/analyzers/src/resources/org/apache/lucene/analysis/cn/stopwords.txt contrib/analyzers/smartcn/src/resources/org/apache/lucene/analysis/cn
svn move contrib/analyzers/src/resources/org/apache/lucene/analysis/cn/smart/hhmm/* contrib/analyzers/smartcn/src/resources/org/apache/lucene/analysis/cn
svn delete contrib/analyzers/src/resources/org/apache/lucene/analysis/cn
svn move contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/
svn move contrib/analyzers/build.xml contrib/analyzers/common
svn move contrib/analyzers/pom.xml.template contrib/analyzers/common
svn move contrib/analyzers/src contrib/analyzers/common

## 3. eclipse "refresh" at project level.
## 4. set text-file encoding at project level to UTF-8
## 5. manually force text-file encoding as UTF-8 for contrib/analyzers/common/src/java/org/apache/lucene/analysis/cn/package.html
##   this is an existing encoding issue that is corrected by this patch.
## 6. apply patch from clipboard (you may now remove the above hack; you will notice this
##   file is now detected properly as UTF-8)
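
Steps 2 and 5 can be sanity-checked from the command line before and after applying the patch. The following is a minimal sketch, not part of the patch: `check_layout` and `is_utf8` are hypothetical helper names, and the list of paths covers only those that the commands above name in full. It assumes a POSIX shell and an `iconv` binary on the PATH.

```shell
#!/bin/sh
# Sketch only: check_layout and is_utf8 are hypothetical helpers, not part of
# the LUCENE-1728 patch.

# Report any path that the commands in step 2 should have created.
# Takes the checkout root as its only argument (defaults to ".").
check_layout() {
    root=${1:-.}
    missing=0
    for path in \
        contrib/analyzers/common/build.xml \
        contrib/analyzers/common/pom.xml.template \
        contrib/analyzers/common/src \
        contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn \
        contrib/analyzers/smartcn/src/test/org/apache/lucene/analysis/cn
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
}

# Succeed only if the file's bytes decode as UTF-8.
# iconv exits non-zero when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}
```

From the checkout root, something like `check_layout . && is_utf8 contrib/analyzers/common/src/java/org/apache/lucene/analysis/cn/package.html` would exercise both. Note that `is_utf8` only confirms the bytes are valid UTF-8; it says nothing about which encoding Eclipse will detect, and a pure-ASCII file passes as well.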

> Move SmartChineseAnalyzer & resources to own contrib project
> ------------------------------------------------------------
>                 Key: LUCENE-1728
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>            Priority: Minor
>             Fix For: 2.9
>         Attachments: LUCENE-1728.txt, LUCENE-1728.txt
> SmartChineseAnalyzer depends on a large dictionary that causes the analyzer jar to grow
> up to 3MB. The dictionary is quite big compared to all the other resources / class files contained
> in that jar.
> Having a separate analyzer-cn contrib project enables footprint-sensitive users (e.g.
> using Lucene on a mobile phone) to include analyzer.jar without getting into trouble with
> disk space.
> Moving SmartChineseAnalyzer to a separate project could also include a small refactoring.
> As Robert mentioned in LUCENE-1722, several classes should be package protected, members and
> classes could be final, and commented-out syserr and logging code should be removed, etc.
> I set this issue's target to 2.9. If we cannot make it by then, feel free to move it
> to 3.0.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
