commons-dev mailing list archives

From "Bruno P. Kinoshita"
Subject Re: [Text] JaccardSimilarity
Date Thu, 07 Mar 2019 21:18:16 GMT
Hi Alex,
I can't recall why it was done that way. When the initial code for the edit distances was written,
some Java libraries such as Simmetrics, java-string-similarity, and Lucene, as well as R/Python code,
were used to verify the output of the edit distances.
Maybe we used Math.round just to get a test passing, which I agree should have been documented.
Even better, though, would be to drop the Math.round and update the tests to use the
assertEquals(expected, actual, threshold) method with a suitably small threshold.
What do you think?
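
To make the suggestion concrete, here is a minimal sketch of a tolerance-based check against an unrounded result. The class name, the `similarity` method, and the `assertClose` helper are all hypothetical and written only for this example; `similarity` is a plain character-set Jaccard, not the library's actual implementation, and returning 0 for two empty inputs is an assumption. In a JUnit test, the helper would simply be `assertEquals(expected, actual, delta)`.

```java
import java.util.HashSet;
import java.util.Set;

public class JaccardToleranceSketch {

    // Hypothetical stand-in for the library method: |A ∩ B| / |A ∪ B| over
    // the sets of characters in each input, returned without any rounding.
    static double similarity(CharSequence left, CharSequence right) {
        Set<Character> a = toSet(left);
        Set<Character> b = toSet(right);
        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // assumption: two empty inputs score 0
        }
        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    private static Set<Character> toSet(CharSequence s) {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < s.length(); i++) {
            set.add(s.charAt(i));
        }
        return set;
    }

    // Hypothetical helper with the same meaning as JUnit's
    // assertEquals(double expected, double actual, double delta).
    static void assertClose(double expected, double actual, double delta) {
        if (Math.abs(expected - actual) > delta) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        // "night" and "nacht" share {n, h, t} out of 7 distinct characters.
        assertClose(3.0 / 7.0, similarity("night", "nacht"), 1e-9);
        System.out.println(similarity("night", "nacht"));
    }
}
```

With a check like this, the expected values in the tests can be written as exact ratios, and the threshold only has to absorb floating-point error rather than two-decimal rounding.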

    On Friday, 8 March 2019, 4:49:52 am NZDT, Alex Herbert wrote:

A quick question about the JaccardSimilarity class:

Q. Why does it round the similarity to 2 decimal places?

This is not documented.

It is also done in the complementary JaccardDistance class.

Looking at the history in git, it seems to have always been that way.
The first commit was on 2016-11-27.
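
To illustrate why the rounding matters, here is a small sketch of what rounding to 2 decimal places does to a similarity score. It assumes the rounding is done with `Math.round(value * 100d) / 100d`; the class and method names are hypothetical, and the library's actual code may differ.

```java
public class RoundingSketch {

    // Assumed form of the rounding: scale by 100, round to the nearest long,
    // then scale back down.
    static double roundTo2(double value) {
        return Math.round(value * 100d) / 100d;
    }

    public static void main(String[] args) {
        double a = 3.0 / 7.0;   // 3 shared characters out of 7
        double b = 13.0 / 30.0; // 13 shared characters out of 30

        // The two ratios differ, but both round to the same 2-decimal value.
        System.out.println(a == b);
        System.out.println(roundTo2(a) == roundTo2(b));
    }
}
```

Under that assumption, distinct set ratios can collapse to the same reported value, so a caller cannot use the result to rank close matches.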


