commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Herbert <alex.d.herb...@gmail.com>
Subject Re: [RNG] Duplicated blocks?
Date Fri, 13 Sep 2019 23:16:43 GMT


> On 13 Sep 2019, at 22:10, Gilles Sadowski <gilleseran@gmail.com> wrote:
> 
> Hi.
> 
> Le ven. 13 sept. 2019 à 22:34, Alex Herbert <alex.d.herbert@gmail.com <mailto:alex.d.herbert@gmail.com>>
a écrit :
>> 
>> 
>> 
>>> On 13 Sep 2019, at 15:27, Gilles Sadowski <gilleseran@gmail.com> wrote:
>>> 
>>> Hello.
>>> 
>>> SonarQube reports "duplicated blocks":
>>>   https://sonarcloud.io/component_measures?id=commons-rng&metric=duplicated_blocks&view=list
>>> 
>>> Does it report about the license header?
>> 
>> No. If you click though to the source code (from a violation in the list) and scroll
you will see a bar in the left margin for the duplicated code. You can click on it and it
states where the duplicate is.
>> 
>> There are 13 reported duplicates and it’s not always a spurious report. Some of
the code blocks are duplicates. In this case it is for code that performs work to compute
more than a single value. As such it is not a simple refactor to move the block into a method
without having to return an object encapsulating the multiple values computed; or pass in
an array to hold all the value computed; or have a common parent with shared state variable.
In the case of common code in the random generator next() method of Well19937a and Well44497a
the various workarounds to eliminate common code would have negative effects and is not worth
it IMO. In one case it is in classes within different packages (e.g. AbstractPcg6432 vs PcgRxsMXs64)
where the amount of duplicate code is actually the same but it is outside of the package structure
to have a common parent. That is unreleased code though so could be rearranged without breaking
binary compatibility.
>> 
>> In about half of the cases it is a spurious report where:
>> 
>> - the code block as text is identical but the types of the variables are different
(e.g. byte/short/int in the MarsagliaTsangWangDiscreteSampler) - in these cases I could try
renaming variables but it seems odd since the code is deliberately copied with modification
only for the data type.
>> - the value of the constants are different on a single line of code that calls a
method (e.g. the AbstractXoRoShiRo128 vs AbstractXoShiRo256).
>> 
>> In one case it is exactly the same algorithm (BoxMullerGaussianSampler vs BoxMullerNormalizedGaussianSampler);
here one is deprecated as it is a known duplicate.
>> 
>>> If so, how to deactivate that spurious report?
>> 
>> Don’t know. I’m currently OK with all the duplicates as they are. Ideally they
could be removed as exceptions without turning off the duplicates report but leaving them
on is fine.
> 
> IIRC it is possible to instruct SonarQube that a particular should be
> considered as solved.
> [I could do it when SonarQube was located at Apache; but one has to be
> logged in (which
> is reasonable for tracking purpose).]
> 
>> 
>> There are a fair number of other code smells (total = 107).
>> 
>> We can get rid of 9 by renaming SEED to S in the commons/rng/simple/internal/ Converters.
> 
> I don't agree with this rule as it contradicts a more important one IMO:
> self-documenting code.
> 
>> There are 33 for deprecation to ignore.
>> 
>> There are 18 for tests that have no assertions. These are tests that check constructors
do not throw when self-seeded (i.e. input seed is too small). They could be updated to assert
the created object is not null. It seems pointless to do that. The methods are commented to
state they are checking construction paths. The generators could be tested to ensure they
do not return all zeros or the same number. But a full statistical test of the output seems
over the top.
>> 
>> There are 15 'Extract the assignment out of this expression’ that I prefer as is
in MarsagliaTsangWangDiscreteSampler. But I did write it. I could change this and it would
not affect sampler performance as the code is all in the sampler creation method. The other
3 are in the generator next() methods and are written that way to match the reference code.
It is a style that I am fine with in these cases. If changed I would like to do performance
tests to check it doesn’t make the method slower to remove a coding style warning. So I
recommend not changing those for simplicity.
> 
> Fine.
> Could you check that you can declare issues as "solved”?

I can sign in with GitHub and it links my Apache credentials. This allows me to comment on
issues and ‘confirm’ them which I could not do when not logged in.

However I do not see options to mark false positives and won’t fix. This stack overflow
thread [1] for SonarCloud 5 states that I need to be added to an "Administer Issues” group
for the project. Maybe the SonarCloud instance is a new version but the same should apply.
I don’t have enough permissions.

Is this an issue to raise with INFRA?

[1] https://stackoverflow.com/questions/29646256/sonarqube-5-how-do-i-mark-false-positive
<https://stackoverflow.com/questions/29646256/sonarqube-5-how-do-i-mark-false-positive>



> 
> Thanks,
> Gilles
> 
>> 
>> 3 for commented out code in the pom.xml could be fixed.
>> 
>> There are 2 ignored tests using zero array seeds. It is known that xor-shift based
generators cannot handle all zero seeds. These tests would never work for those generators
(there are 18 such generators in the library). I suggest deleting these tests. An alternative
would be to test the first 50 values. If all zero then assume the generator is dead. Otherwise
do the statistical test on the generator. The test could be commented to state that if a generator
is outputting non-zero values from an all zero seed that the output should be random.
>> 
>> The TODO comment in InternalGamma can be removed. It is 3 years old and there is
not currently a need to move this helper class to a different component.
>> 
>> Then there are a few legitimate oddities for the remaining 10 or so. I can comment
the code so this can be seen when clicking through from Sonar.
>> 
>> Doing all the fixes would remove 48 code smells. It would make looking at the Sonar
report easier.
>> 
>> Alex
>> 
>> 
>>> 
>>> Regards,
>>> Gilles
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message