lucene-dev mailing list archives

From "Olli Kuonanoja (JIRA)" <>
Subject [jira] [Created] (LUCENE-8501) An ability to define the sum method for custom term frequencies
Date Mon, 17 Sep 2018 07:56:00 GMT
Olli Kuonanoja created LUCENE-8501:

             Summary: An ability to define the sum method for custom term frequencies
                 Key: LUCENE-8501
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/index
            Reporter: Olli Kuonanoja

Custom term frequencies allow expert users to index and score in custom ways. However, _DefaultIndexingChain_
limits this, because the sum of the frequencies is not allowed to overflow:
try {
    invertState.length = Math.addExact(invertState.length, invertState.termFreqAttribute.getTermFrequency());
} catch (ArithmeticException ae) {
    throw new IllegalArgumentException("too many tokens for field \"" + + "\"");
}
This can become a problem when the frequency data is encoded in a different way,
for example when a specific scorer works with float frequencies.
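
To illustrate how quickly the checked sum can overflow, here is a minimal, self-contained sketch. It assumes the float frequency is packed into the int slot with Float.floatToIntBits; that encoding, the class name and the method names are assumptions made for illustration and are not taken from Lucene.

```java
public class FloatFreqOverflow {

    // Assumed encoding (not Lucene's): store a float frequency as its raw IEEE 754 bits.
    public static int encode(float freq) {
        return Float.floatToIntBits(freq);
    }

    // Mirrors the overflow-checked accumulation quoted above.
    // Returns how many tokens were summed before Math.addExact threw.
    public static int tokensUntilOverflow(float freq, int maxTokens) {
        int length = 0;
        for (int i = 0; i < maxTokens; i++) {
            try {
                length = Math.addExact(length, encode(freq));
            } catch (ArithmeticException ae) {
                return i; // the (i + 1)-th token overflowed the int sum
            }
        }
        return maxTokens;
    }

    public static void main(String[] args) {
        // Raw bits of 1.5f are 0x3FC00000, so the third addition overflows.
        System.out.println(tokensUntilOverflow(1.5f, 10));
    }
}
```

With such an encoding even a handful of tokens per field can trigger the "too many tokens" exception, although the logical float sum is tiny.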

The summing method could be added to _TermFrequencyAttribute_, giving something like
invertState.length = invertState.termFreqAttribute.addFrequency(invertState.length);
so that users can define how frequencies are summed and avoid the overflow exceptions.
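
A rough sketch of what such a hook might look like follows. Only the name addFrequency comes from the proposal; the FrequencySummer interface, its default method and the FloatFrequency class are hypothetical stand-ins and do not reflect Lucene's actual TermFrequencyAttribute API.

```java
public class AddFrequencySketch {

    // Hypothetical stand-in for the attribute; NOT Lucene's TermFrequencyAttribute.
    public interface FrequencySummer {
        int getTermFrequency();

        // Default keeps the current behavior: overflow-checked int addition.
        default int addFrequency(int length) {
            return Math.addExact(length, getTermFrequency());
        }
    }

    // Hypothetical custom attribute whose frequencies are floats stored as raw bits.
    public static final class FloatFrequency implements FrequencySummer {
        private final float freq;

        public FloatFrequency(float freq) {
            this.freq = freq;
        }

        @Override
        public int getTermFrequency() {
            return Float.floatToIntBits(freq);
        }

        // Decode the running length, add in float space, re-encode.
        @Override
        public int addFrequency(int length) {
            float sum = Float.intBitsToFloat(length) + freq;
            return Float.floatToIntBits(sum);
        }
    }

    public static void main(String[] args) {
        FrequencySummer attr = new FloatFrequency(1.5f);
        int length = 0; // bits of 0.0f
        for (int i = 0; i < 3; i++) {
            length = attr.addFrequency(length);
        }
        System.out.println(Float.intBitsToFloat(length)); // 4.5
    }
}
```

Keeping the checked addition as the default would leave existing users unaffected, while a custom attribute could override the summing to match its own encoding.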

This message was sent by Atlassian JIRA
