Absolutely, if you want to count or aggregate values of the two groups,
you should definitely go with the aggregate() call instead. The snippets I
provided are just for the case where you want to run some other analysis
over the subsets (e.g., running an algorithm over a sample or fold).
Regards,
Matthias
From: Ethan Xu <ethan.yifanxu@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 03/31/2016 11:31 AM
Subject: Re: Logical indexing?
Ah, I missed the 'removeEmpty()' function. That's a smart way to trim a
matrix. Thanks Matthias!
Also from your answer I realized 'ind = (X[,1] > 10);' is acceptable, so
aggregation would work with
ind = (X[,1] > 10) + 1;
F = aggregate(target = X[,2], groups = ind, fn = "sum");
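A minimal sketch of how this might look end to end in DML, assuming X is an n x 2 matrix that is already loaded (the names grp, S and C are illustrative):

```
# Sketch only: X is assumed to be an n x 2 numeric matrix.
ind = (X[,1] > 10);   # n x 1 indicator: 1 where the condition holds, 0 otherwise
grp = ind + 1;        # shift to positive group ids: 1 = (<= 10), 2 = (> 10)

# Per-group sum and row count of the second column.
S = aggregate(target = X[,2], groups = grp, fn = "sum");
C = aggregate(target = X[,2], groups = grp, fn = "count");
```

Row 1 of S and C would then describe the rows with X[,1] <= 10 and row 2 those with X[,1] > 10, assuming both groups are non-empty.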
Ethan
On Thu, Mar 31, 2016 at 1:22 PM, Matthias Boehm <mboehm@us.ibm.com> wrote:
> just a quick correction of option 2:
>
> Ind = (X[,1]>10);
> Y = removeEmpty(target=X, select=Ind);
>
> Regards,
> Matthias
>
> From: Matthias Boehm/Almaden/IBM@IBMUS
> To: dev@systemml.incubator.apache.org
> Date: 03/31/2016 10:14 AM
> Subject: Re: Logical indexing?
> 
>
>
>
> That's a good question. No, SystemML does not support set indexing yet, but
> you can emulate it via permutation matrices or similar transformations.
> Here are some examples:
>
> # option 1: via permutation (aka selection) matrices
> P = removeEmpty(target=diag(X[,1]>10), margin="rows");
> Y = P %*% X;
>
> # option 2: via removeEmpty
> Ind = diag(X[,1]>10);
> Y = removeEmpty(target=X, select=Ind);
>
>
> Regards,
> Matthias
>
> From: Ethan Xu <ethan.yifanxu@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 03/31/2016 08:47 AM
> Subject: Logical indexing?
> 
>
>
>
> Does SystemML support logical indexing?
>
> For example, suppose X is a numerical matrix with 2 columns and n rows (in
> my case n ~ 35 million). I'd like to split the matrix rowwise according to
> values of the first column. This is useful when I need to find
> distributions of subgroups of a population. In R I can do
>
> Y = X[ X[ ,1] > 10, ]
>
> OR
>
> ind = which(X[ ,1] > 10)
> Y = X[ind, ]
>
> It seems neither syntax works in SystemML.
>
> I noticed there's an aggregate() function for SystemML, but it only
> supports coded categorical variables.
>
> Perhaps one way to do that is creating an indicator n by 1 matrix Z that
> takes values 1 and 2 where 1 corresponds to X[,1] <= 10 and 2 corresponds
> to X[,1] > 10. Then aggregate() X[,2] with respect to Z.
>
> It seems transform() with 'bin' option is one obvious way to create such a
> Z, however the 'bin' method only supports 'equiwidth' currently.
>
> Is looping through X[,1] the best option? Maybe I missed some other
> convenient functions.
>
> Any suggestions are greatly appreciated!
>
> Best,
>
> Ethan
>
>
>
>
