commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex D Herbert (JIRA)" <j...@apache.org>
Subject [jira] [Created] (RNG-55) UnitSphereSampler
Date Wed, 15 Aug 2018 15:29:00 GMT
Alex D Herbert created RNG-55:
---------------------------------

             Summary: UnitSphereSampler
                 Key: RNG-55
                 URL: https://issues.apache.org/jira/browse/RNG-55
             Project: Commons RNG
          Issue Type: Bug
          Components: sampling
    Affects Versions: 1.1
            Reporter: Alex D Herbert


The {{UnitSphereSampler}} does not check for zero length dimension. This doesn't cause a
fail but maybe should be raised as an error? The user will find out when they call {{nextVector()}}
with an {{NegativeArraySizeException}} anyway.

However the algorithm can fail under the extremely unlikely condition that the {{NormalizedGaussianSampler}}
returns {{0}} for each dimension. This is more likely when using {{dimension==1}}.

The returned vector will be {{Double.NaN}} for each dimension due to the use of {{1 / Math.sqrt(0)}}:
{code:java}
// f will be Double.POSITIVE_INFINITY
final double f = 1 / Math.sqrt(normSq);
for (int i = 0; i < dimension; i++) {
    // v will be Double.NaN as 0 * Double.POSITIVE_INFINITY is Double.NaN
    v[i] *= f;
}

{code}
Here is a test that demonstrates it. All it requires in the RNG to provide a long of 0 to
the internal {{ZigguratNormalizedGaussianSampler}}:
{code:java}
@Test
public void testBadRNG() {
    final UniformRandomProvider rng = new UniformRandomProvider() {
        public long nextLong(long n) { return 0; }
        public long nextLong() { return 0; }
        public int nextInt(int n) { return 0; }
        public int nextInt() { return 0; }
        public float nextFloat() { return 0; }
        public double nextDouble() { return 0;}
        public void nextBytes(byte[] bytes, int start, int len) {}
        public void nextBytes(byte[] bytes) {}
        public boolean nextBoolean() { return false; }
    };
    UnitSphereSampler s = new UnitSphereSampler(1, rng);
    double[] v = s.nextVector();
    Assert.assertNotNull(v);
    Assert.assertEquals(1, v.length);
    Assert.assertTrue(Double.isNaN(v[0]));
}
{code}
This can be fixed by an outer loop:
{code:java}
/**
 * @return a random normalized Cartesian vector.
 */
public double[] nextVector() {
    final double[] v = new double[dimension];

    // Pick a point by choosing a standard Gaussian for each element,
    // and then normalize to unit length.
    double normSq = 0;
    while (normSq == 0) {
        for (int i = 0; i < dimension; i++) {
            final double comp = sampler.sample();
            v[i] = comp;
            normSq += comp * comp;
        }
    }

    final double f = 1 / Math.sqrt(normSq);
    for (int i = 0; i < dimension; i++) {
        v[i] *= f;
    }

    return v;
}
{code}
But this can then infinite loop if the {{NormalizedGaussianSampler}} always returns 0 (as
for the dummy test case above).

Q. What is the lesser evil of a vector with NaN (as with the current implementation) or never
returning in an extreme edge case?

Throwing an exception would change the API.

Returning the vector [1,0,0,....] would fix the edge case but not alert the user to a broken
RNG:
{code:java}
/**
 * @return a random normalized Cartesian vector.
 */
public double[] nextVector() {
    final double[] v = new double[dimension];

    // Pick a point by choosing a standard Gaussian for each element,
    // and then normalize to unit length.
    double normSq = 0;
    for (int i = 0; i < dimension; i++) {
        final double comp = sampler.sample();
        v[i] = comp;
        normSq += comp * comp;
    }
    
    if (normSq == 0) {
        // Extreme edge case
        if (dimension != 0)
            v[0] = 1;
        return v;
    }

    final double f = 1 / Math.sqrt(normSq);
    for (int i = 0; i < dimension; i++) {
        v[i] *= f;
    }

    return v;
}
{code}

This is all extremely unlikely however I noticed as I was reviewing new classes in V1.1 and
saw this problem that I have hit before in a random walk simulation.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message