From "Lucene/Solr QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8497) Rethink multi-term analysis handling
Date Wed, 31 Oct 2018 01:10:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669476#comment-16669476 ]

Lucene/Solr QA commented on LUCENE-8497:
----------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 16s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 12s{color} | {color:green} common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 28s{color} | {color:green} icu in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 50s{color} | {color:green} kuromoji in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 48m 35s{color} | {color:green} core in the patch passed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 10s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | LUCENE-8497 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12946196/LUCENE-8497.patch |
| Optional Tests |  compile  javac  unit  ratsources  checkforbiddenapis  validatesourcepatterns  |
| uname | Linux lucene1-us-west 4.4.0-137-generic #163~14.04.1-Ubuntu SMP Mon Sep 24 17:14:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-LUCENE-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 856e28d |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on July 24 2018 |
| Default Java | 1.8.0_172 |
|  Test Results | https://builds.apache.org/job/PreCommit-LUCENE-Build/114/testReport/ |
| modules | C: lucene/analysis/common lucene/analysis/icu lucene/analysis/kuromoji solr/core U: . |
| Console output | https://builds.apache.org/job/PreCommit-LUCENE-Build/114/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Rethink multi-term analysis handling
> ------------------------------------
>
>                 Key: LUCENE-8497
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8497
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Alan Woodward
>            Priority: Major
>         Attachments: LUCENE-8497.patch, LUCENE-8497.patch, LUCENE-8497.patch, LUCENE-8497.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> The current framework for handling term normalisation works via instanceof checks for MultiTermAwareComponent and casts.  MultiTermAwareComponent itself deals in AbstractAnalysisComponents, and so callers need to cast to the correct component type before use, which is ripe for misuse.
> We should re-organise all this to be type-safe and usable without casts.  One possibility is to add `normalize` methods to CharFilterFactory and TokenFilterFactory that mirror their existing `create` methods.  The default implementation would return the input unchanged, while filters that should apply at normalization time can delegate to `create`.
> Related to this, we should deprecate and remove LowerCaseTokenizer, which combines tokenization and normalization in a way that will break this API.
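
The `normalize` proposal in the description above could look roughly like the following. This is only a sketch under stated assumptions, not the attached patch: `FilterFactory`, `LowerCaseFactory`, `StopFactory` and `normalizeTerm` are hypothetical stand-ins, and a plain list of strings takes the place of a real TokenStream so that the example runs without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-ins, not Lucene classes: a "stream" here is just a list of terms.
class NormalizeSketch {

    /** Simplified stand-in for a filter factory with mirrored create/normalize methods. */
    static abstract class FilterFactory {
        /** Full analysis-time behaviour. */
        abstract List<String> create(List<String> input);

        /** Normalization-time behaviour; the default returns the input unchanged. */
        List<String> normalize(List<String> input) {
            return input;
        }
    }

    /** Lowercasing should also apply when normalizing, so normalize delegates to create. */
    static class LowerCaseFactory extends FilterFactory {
        @Override
        List<String> create(List<String> input) {
            List<String> out = new ArrayList<>();
            for (String term : input) {
                out.add(term.toLowerCase(Locale.ROOT));
            }
            return out;
        }

        @Override
        List<String> normalize(List<String> input) {
            return create(input);
        }
    }

    /** Stop-word removal alters the token sequence, so it keeps the default normalize. */
    static class StopFactory extends FilterFactory {
        @Override
        List<String> create(List<String> input) {
            List<String> out = new ArrayList<>(input);
            out.removeIf("the"::equals);
            return out;
        }
    }

    /** Normalizes one term through the whole chain without instanceof checks or casts. */
    static String normalizeTerm(List<FilterFactory> chain, String term) {
        List<String> stream = Collections.singletonList(term);
        for (FilterFactory factory : chain) {
            stream = factory.normalize(stream);
        }
        return stream.get(0);
    }

    public static void main(String[] args) {
        List<FilterFactory> chain = Arrays.asList(new LowerCaseFactory(), new StopFactory());
        System.out.println(normalizeTerm(chain, "The")); // prints "the"
    }
}
```

In this sketch the caller only ever invokes `normalize` on the common base type, so a filter opts in to normalization by overriding one method rather than by implementing a marker interface that callers must detect and cast.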



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

