systemml-dev mailing list archives

From Matthias Boehm <mboe...@googlemail.com>
Subject Re: Matrix non-range indexing should return a scalar
Date Sat, 29 Jul 2017 03:40:20 GMT
Thanks for bringing this up, Mike - this is a useful discussion. Let us
first clarify the R semantics. In R, indexing a matrix with scalar or vector
indices yields a numeric vector, because R has no scalars, only vectors of
length 1. So Nakul, R does not behave like this proposal. If I remember correctly,
Matlab also represents scalars as 1x1 matrices. We discussed a similar
approach for SystemML many years ago but decided against it for performance
reasons and to support scalars of value types other than double.

From my perspective, the proposed approach would lead to inconsistent
behavior because the data type of 1x1 output matrices would either be
scalar or matrix, depending on subtle specification differences. Right now
our indexing semantics are very clear: any indexing returns a matrix and
X[1,2] is shorthand for X[1:1,2:2] and X[1,] is shorthand for
X[1,1:ncol(X)]. For example, why should X[1,2] evaluate to a scalar but
X[1:1,2:2] and X[rl:ru,cl:cu] (with rl=ru and cl=cu) evaluate to a matrix,
just because the index ranges are explicitly specified and potentially
unknown during compilation?
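
To make the shorthand rule concrete, here is a minimal Python sketch of it.
This is a toy model and not SystemML code: the helper names `normalize` and
`index` are hypothetical, matrices are plain lists of rows, and indices are
assumed to be 1-based with inclusive ranges as in DML.

```python
# Toy model of "any indexing returns a matrix" (not SystemML code).
# Matrices are lists of rows; indices are 1-based and ranges are inclusive.

def normalize(ix, upper):
    """Expand an index spec into an inclusive (lower, upper) range.

    None   -> whole dimension, like the empty slot in X[1,]
    int    -> single position i, treated as the range i:i
    (l, u) -> explicit range l:u
    """
    if ix is None:
        return 1, upper
    if isinstance(ix, int):
        return ix, ix
    return ix


def index(X, rows=None, cols=None):
    """Right indexing that always returns a matrix, never a scalar."""
    rl, ru = normalize(rows, len(X))
    cl, cu = normalize(cols, len(X[0]))
    return [row[cl - 1:cu] for row in X[rl - 1:ru]]


X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

cell = index(X, 1, 2)               # models X[1,2]
ranged = index(X, (1, 1), (2, 2))   # models X[1:1,2:2]
first_row = index(X, 1)             # models X[1,]
```

In this model `cell` and `ranged` are the same 1x1 matrix, because a single
index is only ever a range whose bounds happen to coincide.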

In contrast to interpreters for dynamically typed languages, it's
unfortunately also not possible to decide the data type at runtime in order
to achieve consistency. This is because we compile data-type-specific
instructions for operations on the output ahead of time. For example, A *
X[i,] would be compiled to a matrix-matrix arithmetic instruction - so we
could not return a scalar for X[i,] even if X is a column vector. Even a
generalization to generic, data-type-oblivious instructions would not solve
the limitation because other operations are simply not defined over
scalars.
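
The compile-time constraint can be illustrated with a small Python sketch.
Everything here is hypothetical (the instruction names, the type tags, and the
functions are made up for illustration and are not SystemML's compiler); it
only shows that the instruction is picked from the operand data types before
any dimensions are known.

```python
# Toy ahead-of-time instruction selection (hypothetical, not SystemML's compiler).
MATRIX, SCALAR = "matrix", "scalar"

# Hypothetical instruction names keyed by the operand data types.
MULTIPLY = {
    (MATRIX, MATRIX): "matrix-matrix multiply",
    (MATRIX, SCALAR): "matrix-scalar multiply",
    (SCALAR, MATRIX): "scalar-matrix multiply",
    (SCALAR, SCALAR): "scalar-scalar multiply",
}


def index_type_current(row_is_range, col_is_range):
    """Current semantics: indexing always yields a matrix."""
    return MATRIX


def index_type_proposed(row_is_range, col_is_range):
    """Proposed semantics: the type depends only on the syntax used."""
    return MATRIX if (row_is_range or col_is_range) else SCALAR


def select_multiply(left_type, right_type):
    """Pick the instruction at compile time, before dimensions are known."""
    return MULTIPLY[(left_type, right_type)]


# A * X[i,]: the column slot spans the whole dimension, so it is a matrix
# under both rules, even if X turns out to be a column vector at runtime.
a_times_row = select_multiply(MATRIX, index_type_current(False, True))

# A * X[1,2] versus A * X[1:1,2:2] under the proposed rule.
a_times_cell = select_multiply(MATRIX, index_type_proposed(False, False))
a_times_range = select_multiply(MATRIX, index_type_proposed(True, True))
```

Under the proposed rule the last two expressions select different
instructions although both address the same single cell.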

The bottom line is, I would prefer to keep our currently clean indexing
semantics because the special case handling of X[1,2] might lead to more
confusion. However, if I'm outvoted, it could be easily introduced by
injecting an as.scalar(X[1,2]) during parsing, which would at least keep
these inconsistencies out of the compiler and runtime stack.
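
A parse-time injection of this kind could look roughly like the following
Python sketch. The AST classes (`Range`, `Index`, `Call`) and the `rewrite`
function are hypothetical stand-ins, not SystemML's parser; only the
`as.scalar` name comes from DML.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Range:
    """An explicit range such as 1:1 or rl:ru."""
    lower: str
    upper: str


@dataclass(frozen=True)
class Index:
    """An indexing expression; None marks an empty slot as in X[1,]."""
    target: str
    row: Optional[Union[str, Range]]
    col: Optional[Union[str, Range]]


@dataclass(frozen=True)
class Call:
    """A builtin call wrapped around an indexing expression."""
    name: str
    arg: Index


def is_single_cell(node):
    """True only if both slots are single (non-range) index expressions."""
    return isinstance(node.row, str) and isinstance(node.col, str)


def rewrite(node):
    """Wrap X[i,j] in as.scalar(...); leave every other indexing form alone."""
    if is_single_cell(node):
        return Call("as.scalar", node)
    return node


cell = rewrite(Index("X", "1", "2"))                            # X[1,2]
ranged = rewrite(Index("X", Range("1", "1"), Range("2", "2")))  # X[1:1,2:2]
row = rewrite(Index("X", "1", None))                            # X[1,]
```

Because the decision is made purely on syntax, the rest of the pipeline
still sees indexing as a matrix operation followed by an explicit cast.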

Regards,
Matthias

On Fri, Jul 28, 2017 at 5:58 PM, Imran Younus <imranyounus@gmail.com> wrote:

> +1
>
> Numpy also behaves the way Mike is suggesting here.
>
> imran
>
> On Fri, Jul 28, 2017 at 5:21 PM, Nakul Jindal <nakul02@gmail.com> wrote:
>
> > +1 to Mike & Deron.
> >
> > Two other languages/packages that behave like this:
> > R : http://www.r-tutor.com/r-introduction/matrix
> > Octave :
> > https://www.gnu.org/software/octave/doc/interpreter/Index-Expressions.html
> >
> >
> > -Nakul
> >
> >
> >
> >
> > On Fri, Jul 28, 2017 at 4:03 PM, Deron Eriksson <deroneriksson@gmail.com> wrote:
> >
> > > Thank you Mike for bringing this up. To me, this definitely makes sense at
> > > the user (DML) level.
> > >
> > > For a Java-style pseudocode example, currently we require the user to do
> > > the following:
> > >   int[][] m = int[][]{1,2,3,4};
> > >   int[][] n = m[0][0];
> > >   int x = (int) n;
> > >
> > > I feel the following would be more 'natural':
> > >   int[][] m = int[][]{1,2,3,4};
> > >   int x = m[0][0];
> > >
> > > If a user asks for a specific cell (and not a range) in DML code, I think
> > > the user clearly wants a value and not a matrix that the user needs to cast
> > > via as.scalar.
> > >
> > > Deron
> > >
> > >
> > >
> > > On Fri, Jul 28, 2017 at 3:41 PM, <dusenberrymw@gmail.com> wrote:
> > >
> > > > Currently, non-range matrix indexing, such as `X[1,2]`, returns a 1x1
> > > > matrix in SystemML rather than a single scalar value.  This is inconsistent
> > > > with mathematical semantics, and with array indexing semantics of any major
> > > > language, thus leading to confusion for users.
> > > >
> > > > I would like to propose that non-range indexing at the language level,
> > > > such as `X[1,2]`, should return a single scalar value, and range indexing
> > > > of any kind at the language level, including the trivial example
> > > > `X[1:1,2:2]`, should return a matrix.  This would lead to clear semantics
> > > > that are consistent with mathematics and language array indexing, thus
> > > > preventing user confusion.  Additionally, these are the semantics that the
> > > > NumPy project uses.
> > > >
> > > > Interested to hear thoughts from the rest of the community!
> > > >
> > > > -Mike
> > > >
> > > > --
> > > >
> > > > Mike Dusenberry
> > > > GitHub: github.com/dusenberrymw
> > > > LinkedIn: linkedin.com/in/mikedusenberry
> > > >
> > > > Sent from my iPhone.
> > > >
> > > >
> > >
> >
>
