lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (LUCENE-8752) Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)
Date Sun, 14 Apr 2019 00:03:00 GMT


ASF subversion and git services commented on LUCENE-8752:

Commit a4ba4b0b7c81c10c0f078a094a0ad1ba3453d633 in lucene-solr's branch refs/heads/branch_8x
from Uwe Schindler

LUCENE-8752: Add license header to patch file

Revert "LUCENE-8752: Fix precommit error: patch files cannot have a license header" - This
reverts commit b60548f6d88bc1e3bba9916fc19d1c90b6505e28.

> Apply a patch to kuromoji dictionary to properly handle Japanese new era '令和' (REIWA)
> -------------------------------------------------------------------------------------
>                 Key: LUCENE-8752
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Tomoko Uchida
>            Assignee: Tomoko Uchida
>            Priority: Minor
>             Fix For: 8.1, master (9.0)
>         Attachments: LUCENE-8752.patch
>          Time Spent: 20m
>  Remaining Estimate: 0h
> As of May 1st, 2019, the Japanese era name ('元号', Gengo) will change to '令和' (Reiwa). See
this article for more details:
> []
> Currently '令和' is split into '令' and '和' by {{JapaneseTokenizer}}. It should
be tokenized as one word so that Japanese texts including era names are searched as users
expect. Because the default Kuromoji dictionary (mecab-ipadic) has not been maintained since
2007, a one-line patch to the source CSV file is needed for this era change.
> Era names are used in many official or formal documents in Japan, so it would be desirable
for search systems to handle this properly without adding a user dictionary or using a phrase query.
> FYI, the JDK DateTime API will support the new era (in upcoming updates).
> []
> The patch is available here:
> []
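
The tokenization problem described in the issue can be illustrated with a small sketch. This is
not Kuromoji's implementation: it is a toy lowest-cost segmenter that ignores connection costs
and part-of-speech data, and the function name `tokenize`, the lexicon entries, and all cost
values are made up for illustration. It only shows why a single missing dictionary entry makes
'令和' fall apart into two tokens, and why adding one entry is enough to fix it.

```python
def tokenize(text, lexicon, unknown_cost=10000):
    """Return the lowest-cost segmentation of `text` (toy model, word costs only)."""
    n = len(text)
    # best[i] holds (total cost, tokens) for the prefix text[:i].
    best = [(0, [])] + [None] * n
    for start in range(n):
        cost_so_far, tokens = best[start]
        for end in range(start + 1, n + 1):
            word = text[start:end]
            if word in lexicon:
                cost = lexicon[word]
            elif end == start + 1:
                # Fall back to a single unknown character at a high cost.
                cost = unknown_cost
            else:
                continue
            candidate = (cost_so_far + cost, tokens + [word])
            if best[end] is None or candidate[0] < best[end][0]:
                best[end] = candidate
    return best[n][1]


# Hypothetical lexicon standing in for the unpatched dictionary (costs are invented).
unpatched = {"令": 5000, "和": 5000, "元年": 3000}
# The "one-line patch": a single new entry for the era name.
patched = dict(unpatched, **{"令和": 4000})

before = tokenize("令和元年", unpatched)
after = tokenize("令和元年", patched)
print(before)
print(after)
```

With the unpatched lexicon the only way to cover '令和' is as two single-character words, so the
era name is split. Once the entry exists, the one-word path is cheaper and wins, which is the
effect the dictionary patch is meant to have without requiring a user dictionary.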

This message was sent by Atlassian JIRA
