commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles Sadowski <gillese...@gmail.com>
Subject Re: [RNG] Duplicated blocks?
Date Fri, 13 Sep 2019 21:10:37 GMT
Hi.

Le ven. 13 sept. 2019 à 22:34, Alex Herbert <alex.d.herbert@gmail.com> a écrit :
>
>
>
> > On 13 Sep 2019, at 15:27, Gilles Sadowski <gilleseran@gmail.com> wrote:
> >
> > Hello.
> >
> > SonarQube reports "duplicated blocks":
> >    https://sonarcloud.io/component_measures?id=commons-rng&metric=duplicated_blocks&view=list
> >
> > Does it report about the license header?
>
> No. If you click though to the source code (from a violation in the list) and scroll
you will see a bar in the left margin for the duplicated code. You can click on it and it
states where the duplicate is.
>
> There are 13 reported duplicates and it’s not always a spurious report. Some of the
code blocks are duplicates. In this case it is for code that performs work to compute more
than a single value. As such it is not a simple refactor to move the block into a method without
having to return an object encapsulating the multiple values computed; or pass in an array
to hold all the value computed; or have a common parent with shared state variable. In the
case of common code in the random generator next() method of Well19937a and Well44497a the
various workarounds to eliminate common code would have negative effects and is not worth
it IMO. In one case it is in classes within different packages (e.g. AbstractPcg6432 vs PcgRxsMXs64)
where the amount of duplicate code is actually the same but it is outside of the package structure
to have a common parent. That is unreleased code though so could be rearranged without breaking
binary compatibility.
>
> In about half of the cases it is a spurious report where:
>
> - the code block as text is identical but the types of the variables are different (e.g.
byte/short/int in the MarsagliaTsangWangDiscreteSampler) - in these cases I could try renaming
variables but it seems odd since the code is deliberately copied with modification only for
the data type.
> - the value of the constants are different on a single line of code that calls a method
(e.g. the AbstractXoRoShiRo128 vs AbstractXoShiRo256).
>
> In one case it is exactly the same algorithm (BoxMullerGaussianSampler vs BoxMullerNormalizedGaussianSampler);
here one is deprecated as it is a known duplicate.
>
> > If so, how to deactivate that spurious report?
>
> Don’t know. I’m currently OK with all the duplicates as they are. Ideally they could
be removed as exceptions without turning off the duplicates report but leaving them on is
fine.

IIRC it is possible to instruct SonarQube that a particular should be
considered as solved.
[I could do it when SonarQube was located at Apache; but one has to be
logged in (which
is reasonable for tracking purpose).]

>
> There are a fair number of other code smells (total = 107).
>
> We can get rid of 9 by renaming SEED to S in the commons/rng/simple/internal/ Converters.

I don't agree with this rule as it contradicts a more important one IMO:
self-documenting code.

> There are 33 for deprecation to ignore.
>
> There are 18 for tests that have no assertions. These are tests that check constructors
do not throw when self-seeded (i.e. input seed is too small). They could be updated to assert
the created object is not null. It seems pointless to do that. The methods are commented to
state they are checking construction paths. The generators could be tested to ensure they
do not return all zeros or the same number. But a full statistical test of the output seems
over the top.
>
> There are 15 'Extract the assignment out of this expression’ that I prefer as is in
MarsagliaTsangWangDiscreteSampler. But I did write it. I could change this and it would not
affect sampler performance as the code is all in the sampler creation method. The other 3
are in the generator next() methods and are written that way to match the reference code.
It is a style that I am fine with in these cases. If changed I would like to do performance
tests to check it doesn’t make the method slower to remove a coding style warning. So I
recommend not changing those for simplicity.

Fine.
Could you check that you can declare issues as "solved"?

Thanks,
Gilles

>
> 3 for commented out code in the pom.xml could be fixed.
>
> There are 2 ignored tests using zero array seeds. It is known that xor-shift based generators
cannot handle all zero seeds. These tests would never work for those generators (there are
18 such generators in the library). I suggest deleting these tests. An alternative would be
to test the first 50 values. If all zero then assume the generator is dead. Otherwise do the
statistical test on the generator. The test could be commented to state that if a generator
is outputting non-zero values from an all zero seed that the output should be random.
>
> The TODO comment in InternalGamma can be removed. It is 3 years old and there is not
currently a need to move this helper class to a different component.
>
> Then there are a few legitimate oddities for the remaining 10 or so. I can comment the
code so this can be seen when clicking through from Sonar.
>
> Doing all the fixes would remove 48 code smells. It would make looking at the Sonar report
easier.
>
> Alex
>
>
> >
> > Regards,
> > Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message