commons-dev mailing list archives

From Phil Steitz
Subject Re: [math] RealLinearOperator and AbstractRealMatrix
Date Thu, 14 Jul 2011 16:37:17 GMT
On 7/14/11 8:39 AM, Phil Steitz wrote:
> On 7/14/11 3:49 AM, Gilles Sadowski wrote:
>> On Wed, Jul 13, 2011 at 03:01:09PM -0700, Ted Dunning wrote:
>>> Actually, this is a major issue.
>> Indeed, it is an important design issue.
>>> Take, for instance, the example of considering a Lucene index as a linear
>>> operator.  The number of rows is the number of documents (which is changing
>>> as documents are added) and the number of columns is the number of unique
>>> terms (which is also changing as documents are added).
>>> Matrix multiplication consists of translating a (hopefully sparse) vector
>>> into a query, running the query and interpreting the result as a (sparse)
>>> vector.
>>> This is just one of a host of similar application oriented definitions of a
>>> LinearOperator, none of which define the number of rows and columns at
>>> construction time.
>>> I can still do a singular value decomposition of this linear operator
>> Eventually it seems to come down to a programming style issue. What you
>> expose above (the decomposition algorithm) uses _fixed_ dimensions; when you
>> need another object (with different dimensions) you'll create a new one.
> That is a serious constraint that you are proposing that we impose
> at the top level of the hierarchy and force on all implementations.
> Ted's example indicates that in at least one potential practical use
> case, this constraint would be problematic.   Reconstructing and
> reestablishing state for what may be a complex object may not be
> practical.  The decision that you are advocating is that all
> supported linear operators and matrices must be immutable wrt
> dimensions.  What happens when we try to implement an operator such
> as Ted described?  Currently, we do not have that constraint, even
> for matrices.  Why impose it on matrices now and extend it further
> to linear operators?  Moreover, why force all real matrices and
> operators to maintain the (unnecessary) fields in the superclass?
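To make the accessor-versus-field question concrete, here is a minimal sketch of an operator whose row dimension is computed on demand rather than stored at construction. Every name in it (LinearOperatorSketch, GrowingIndexOperator, addDocument) is hypothetical and is not part of the Commons Math API. It is also only a toy stand-in for Ted's Lucene example: the column count is held fixed for brevity, whereas in his case the number of terms changes too.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical operator interface: dimensions are queried, never stored in a superclass. */
interface LinearOperatorSketch {
    int getRowDimension();
    int getColumnDimension();
    double[] operate(double[] x);
}

/** Toy "index as operator": one row per document, so the row count grows over time. */
final class GrowingIndexOperator implements LinearOperatorSketch {
    private final List<double[]> documents = new ArrayList<>();
    private final int vocabularySize; // assumption: fixed here to keep the sketch short

    GrowingIndexOperator(int vocabularySize) {
        if (vocabularySize <= 0) {
            throw new IllegalArgumentException("vocabularySize must be positive");
        }
        this.vocabularySize = vocabularySize;
    }

    /** Adds a document as a vector of term weights; the operator gains one row. */
    void addDocument(double[] termWeights) {
        if (termWeights.length != vocabularySize) {
            throw new IllegalArgumentException("expected " + vocabularySize + " term weights");
        }
        documents.add(termWeights.clone());
    }

    @Override
    public int getRowDimension() {
        return documents.size(); // reflects the current state, not a value fixed at construction
    }

    @Override
    public int getColumnDimension() {
        return vocabularySize;
    }

    /** Computes y = A x, where row i of A is the i-th document added so far. */
    @Override
    public double[] operate(double[] x) {
        if (x.length != vocabularySize) {
            throw new IllegalArgumentException("expected a vector of length " + vocabularySize);
        }
        double[] y = new double[documents.size()];
        for (int i = 0; i < y.length; i++) {
            double[] row = documents.get(i);
            double sum = 0.0;
            for (int j = 0; j < vocabularySize; j++) {
                sum += row[j] * x[j];
            }
            y[i] = sum;
        }
        return y;
    }
}
```

An algorithm that needs fixed dimensions can still read getRowDimension() once at the start of a computation; the interface simply does not promise that the value stays the same between computations.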

This may belong on a separate thread, but in looking at options for
refactoring the multiple regression API, I have come across another
example where it would be convenient to be able to add rows to a
RealMatrix.  Currently, none of the implementations support this,
but nothing in the currently defined RealMatrix / AbstractRealMatrix
API prevents subclasses from adding an addRows method, or such a
method being added to Array2DRowRealMatrix.  The latter would be
convenient for the in-memory multiple regression (and other) stats
classes.  Without such a method, the OLS/GLSMultipleLinearRegression
classes will have to either move away from using a RealMatrix
instance to store the design matrix (maintaining double[][] arrays
internally instead and constructing RealMatrix instances just for
computation) or do lots of unnecessary copying of data to replace
matrices when (blocks of) rows are added to the design.  The
unnecessary copying or "back door" access to the underlying data
array is what is forced by insisting that matrices be
dimension-immutable.  If we decide to extend the in-memory
regression classes to support addObservations, I will start a
separate thread to discuss matrix support options.  The point of
mentioning it here is that it is another illustration of why forcing
dimension-immutability on matrices may not be a good idea.
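As a rough illustration of the kind of row-append support I have in mind, the sketch below uses a plain double[][] store with spare capacity. GrowableRowMatrix and its addRows method are hypothetical names; nothing like this exists in Array2DRowRealMatrix today, and the real signature would be for the separate thread to settle. The point is only that appending a block copies the new rows once and never recopies the data already in the design matrix.

```java
import java.util.Arrays;

/** Hypothetical row-growable dense matrix; NOT an existing Commons Math class. */
final class GrowableRowMatrix {
    private double[][] rows;     // backing storage, possibly with unused trailing slots
    private int rowCount;        // number of rows currently in use
    private final int columns;   // column dimension stays fixed in this sketch

    GrowableRowMatrix(int columns) {
        if (columns <= 0) {
            throw new IllegalArgumentException("columns must be positive");
        }
        this.columns = columns;
        this.rows = new double[4][];
    }

    /** Appends a block of rows; existing row data is left in place. */
    void addRows(double[][] block) {
        // Validate the whole block first so a bad row leaves the matrix unchanged.
        for (double[] r : block) {
            if (r.length != columns) {
                throw new IllegalArgumentException("expected rows of length " + columns);
            }
        }
        int needed = rowCount + block.length;
        if (needed > rows.length) {
            // Only the array of row references is reallocated, not the row contents.
            rows = Arrays.copyOf(rows, Math.max(needed, 2 * rows.length));
        }
        for (double[] r : block) {
            rows[rowCount++] = r.clone();
        }
    }

    int getRowDimension() {
        return rowCount;
    }

    int getColumnDimension() {
        return columns;
    }

    double getEntry(int row, int column) {
        if (row < 0 || row >= rowCount) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return rows[row][column];
    }

    /** Computes y = A x over the rows added so far. */
    double[] operate(double[] x) {
        if (x.length != columns) {
            throw new IllegalArgumentException("expected a vector of length " + columns);
        }
        double[] y = new double[rowCount];
        for (int i = 0; i < rowCount; i++) {
            double sum = 0.0;
            for (int j = 0; j < columns; j++) {
                sum += rows[i][j] * x[j];
            }
            y[i] = sum;
        }
        return y;
    }
}
```

With a dimension-immutable matrix, each addObservations call would instead have to allocate a new matrix and copy every existing row into it, which is the unnecessary copying described above.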

> Phil
>> Both approaches are possible for your use case. But what I think should be
>> avoided is several programming styles within a single library.
>> The abstract accessor style is uncommon in CM and in the "linear" package,
>> the implementations use this style for no good reason (as I've indicated in
>> my previous mail).
>> Regards,
>> Gilles
>>> On Wed, Jul 13, 2011 at 1:52 PM, Phil Steitz wrote:
>>>> This is not a huge issue, though, so I can live with it if you feel
>>>> strongly that the fields should be persisted at the top level.
