commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luc Maisonobe <>
Subject Re: [math] Performance optimization in the vector classes
Date Mon, 02 Nov 2009 20:43:04 GMT
Jake Mannix a écrit :
> On Mon, Nov 2, 2009 at 7:41 AM, <> wrote:
>> This is the reason why the boolean is there.
> Yeah, I know, my point is that in the current method implementation,
> the boolean is not used properly - it's currently: return new
> ArrayRealVector(v)
> which forces a copy.  return new ArrayRealVector(v, false) fixes this bug.
>>> The difference of doing only one array access inside the loop alone
>>> saved
>>> 30% on profiling tests on my laptop with 1M long double arrays.  I
>>> don't
>>> even want to think about the cost of making superflous array copies as
>>> well.
>> Fine, good catch. Let's implement it.
> I'll open a separate JIRA and patch for it then.
>>> 2) In OpenMapRealVector, any case where two OpenMapRealVectors are
>>> combined
>>> together, iterating over the smaller should be preferred, but this is
>>> never
>>> checked for.
>> I think we discussed about similar things in the last few months, and
>> simply forgot to do it.
> Ditto for this one.
>> In the cast, the check for the class is done internally, and can be
>> optimised by the compiler, particularly when this method is inlined. So this
>> is an attempt to avoid one if and to let the compiler directly call the
>> specific methods for ArrayRealVector which is much faster than the generic
>> one. I understand it is a bit weird.
> It really works this way?  The JIT compiler takes a situation where an
> exception is thrown, and optimizes it away into calling the other methods,
> without doing any stack jumping?  If at runtime, the method only gets passed
> ArrayRealVectors for a while, how does the JIT know that OpenMapRealVector
> won't be called in the future?  It seems that if a runtime configuration had
> a large mixture of calls to this method with different vector
> implementations, there could be no inlining, and instead there would be tons
> of superfluous Exception object creation and stack jumping.  Or am I missing
> how this works?

There are several cases. If you do:

  RealVector v = new ArrayRealVector(...);

and later use v, a static analysis would be sufficient to know v is
really an ArrayRealVector instance.

If you have a more complex code, but at the beginning of the loops the
JVM knows it has only loaded one class implementing RealVector, then the
compiler will use that information when optimising the native code it
creates at run time. If later a new implementation is loaded, then the
JVM invalidates this code and redo the compilation using different
assumptions for optimisation. This is one of the great things with
dynamic compilation and optimisation. It adjusts itself according to how
the code is used. Modern optimizers are really smart.

>> 4) ArrayRealVector#unitVector() calls norm() once, and then it calls
>>> it
>>> again.  That's clearly unnecessary.
>> You are right. the lat line should simply use the norm local variable which
>> in fact was introduced just for this reason.
> JIRA and patch, coming up.



>>> Also, why the special case for
>>> ArithmeticExcption here, for the zero norm case?  Why not just let
>>> java just
>>> try to divide, and if it divides by zero, well, it'll throw an
>>> ArithmeticException itself...  seems like the whole method should just
>>> be
>>> implemented as " return mapDivide(norm()); "
>> No. A division by zero does not lead to ArithmeticException when dealing
>> with double variables, it leads to a NaN being produced. ArithmeticException
>> is triggered when an integer division by zero occurs.
> Duh of course, right.
>> We really prefer separate patches as we commit them separately.
>> This allows having a clear history in the svn repository.
> Fair enough.
>   -jake

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message