lucene-solr-user mailing list archives

From Eduard Moraru <enygma2...@gmail.com>
Subject Which one is it "cs" or "cz" for Czech language?
Date Tue, 17 Mar 2015 17:35:49 GMT
Hi,

First of all, a bit of a disclaimer: I do not speak Czech at all.

We are using Solr's dynamic fields in our project (XWiki), and we have
recently noticed a problem [1] with the Czech language.

Basically, our mapping says something like this:

<dynamicField name="*_cz" type="text_cz" indexed="true" stored="true"
multiValued="true" />

...but at runtime, we ask for the language code "cs" (which is the ISO 639-1
language code for Czech [2]) and it fails, since the mapping only matches
field names ending in "_cz".

Now, we can easily fix this on our end by changing the mapping to name="*_cs",
but what we are really wondering is why Lucene/Solr uses "cz" (the country
code) instead of "cs" (the language code) in both its "text_cz" field type
and its "stopwords_cz.txt" file.
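
For reference, the fix on our end would look roughly like the fragment below.
This is only a sketch: it assumes we keep pointing at the existing "text_cz"
field type and change nothing but the dynamic field suffix.

```xml
<!-- Sketch of the workaround: match the ISO 639-1 language code "cs"
     in the field name, while still using the "text_cz" field type. -->
<dynamicField name="*_cs" type="text_cz" indexed="true" stored="true"
              multiValued="true" />
```

With this in place, a field requested with the "_cs" suffix would be analyzed
using the Czech field type, whatever that type happens to be called.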

Is that a mistake on the Solr/Lucene side? Is it some kind of convention?
Is it going to be fixed?

Thanks,
Eduard

----------
[1] http://jira.xwiki.org/browse/XWIKI-11897
[2] http://en.wikipedia.org/wiki/Czech_language
