systemml-dev mailing list archives

From "Niketan Pansare" <npan...@us.ibm.com>
Subject Re: about performance statistics of PCA.dml
Date Fri, 21 Jul 2017 17:58:13 GMT

Hi Janardhan,

You can get instruction-level statistics with this commit:
https://github.com/apache/systemml/commit/648eb21d66f9cd8727090cdf950986765a7e6ee8

SystemML Statistics:
Total elapsed time:             18.956 sec.
Total compilation time:         1.924 sec.
Total execution time:           17.032 sec.
Number of compiled Spark inst:  3.
Number of executed Spark inst:  0.
Cache hits (Mem, WB, FS, HDFS): 29/0/0/1.
Cache writes (WB, FS, HDFS):    24/0/4.
Cache times (ACQr/m, RLS, EXP): 0.201/0.001/0.007/8.379 sec.
HOP DAGs recompiled (PRED, SB): 0/1.
HOP DAGs recompile time:        0.007 sec.
Spark ctx create time (lazy):   0.949 sec.
Spark trans counts (par,bc,col):0/0/0.
Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
Total JIT compile time:         4.86 sec.
Total JVM GC count:             7.
Total JVM GC time:              0.192 sec.
Heavy hitter instructions:
  #  Instruction                    Time(s)  Count  Misc Timers
  1  write [PCA.dml 110:8-110:14]     7.628      1
  2  eigen [PCA.dml 85:1-85:1]        6.858      1  rlswr[0.000s,2], rlsev[0.000s,0], aqmd[0.000s,2]
  3  write [92:12-92:25]              0.689      1
  4  ba+* [PCA.dml 110:8-110:14]      0.500      1  rlswr[0.000s,1], aqmd[0.000s,1], aqrd[0.000s,2], rlsev[0.000s,0], rlsi[0.001s,2]
  5  tsmm [PCA.dml 81:5-81:16]        0.338      1  rlswr[0.000s,1], rlsev[0.000s,0], rlsi[0.000s,1], aqrd[0.000s,1], aqmd[0.000s,1]
  6  uacmean [PCA.dml 66:5-66:5]      0.320      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,1], aqrd[0.200s,1]
  7  uacsqk+ [PCA.dml 70:23-70:23]    0.177      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,1], rlsi[0.000s,1]
  8  ba+* [92:12-92:25]               0.175      1  rlswr[0.000s,1], aqrs[0.000s,1], aqrd[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,2]
  9  / [PCA.dml 75:16-75:31]          0.088      1  rlswr[0.000s,1], rlsev[0.000s,0], aqrd[0.000s,2], aqmd[0.000s,1], rlsi[0.000s,2]
 10  - [PCA.dml 67:9-67:13]           0.048      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,2], rlsi[0.000s,2]
 11  write [90:11-90:23]              0.044      1
 12  uack+ [PCA.dml 80:6-80:6]        0.036      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,1], rlsi[0.000s,1]
 13  uacmean [PCA.dml 72:2-72:2]      0.028      1  rlswr[0.000s,1], rlsev[0.000s,0], aqrd[0.000s,1], aqmd[0.000s,1], rlsi[0.000s,1]
 14  -* [PCA.dml 81:5-81:22]          0.026      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,2], rlsi[0.000s,2]
 15  / [PCA.dml 81:5-81:22]           0.019      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,1], rlsi[0.000s,1]
 16  write [102:1-102:1]              0.018      1
 17  tsmm [PCA.dml 81:36-81:46]       0.008      1  rlswr[0.000s,1], rlsev[0.000s,0], aqrd[0.000s,1], rlsi[0.000s,1], aqmd[0.000s,1]
 18  ctableexpand [88:1-88:1]         0.007      1  rlsev[0.000s,0], rlsi[0.000s,2], aqms[0.000s,1], aqrd[0.000s,2], rlswr[0.002s,1]
 19  seq [88:17-88:17]                0.004      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1]
 20  ba+* [90:11-90:23]               0.003      1  rlswr[0.000s,1], rlsev[0.000s,0], aqrd[0.000s,1], rlsi[0.000s,2], aqmd[0.000s,1], aqrs[0.000s,1]
 21  rsort [87:1-87:1]                0.003      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,1], aqrd[0.000s,1]
 22  sqrt [PCA.dml 75:20-75:20]       0.002      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,1], aqrd[0.000s,1]
 23  !=                               0.001      1
 24  rmvar [-1:-1--1:-1]              0.001     22
 25  ^2 [PCA.dml 73:25-73:30]         0.001      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,1], aqrd[0.000s,1]
 26  / [PCA.dml 73:14-73:37]          0.001      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], aqrd[0.000s,1], rlsi[0.000s,1]
 27  -* [PCA.dml 73:15-73:19]         0.000      1  rlswr[0.000s,1], rlsev[0.000s,0], aqmd[0.000s,1], rlsi[0.000s,2], aqrd[0.000s,2]
 28  sqrt [102:1-102:1]               0.000      1  rlswr[0.000s,1], rlsev[0.000s,0], rlsi[0.000s,1], aqrd[0.000s,1], aqmd[0.000s,1]
 29  + [104:28-104:34]                0.000      1
 30  createvar [90:11-90:23]          0.000      1

At first glance (so please feel free to correct me if I am wrong):
Heavy hitter number 5 corresponds to the expression (t(A) %*% A).
Heavy hitter number 17 corresponds to the expression t(mu) %*% mu.
Heavy hitter number 15 corresponds to the expression (output of instruction
5) / scalar
and so on ...
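
To make the mapping above concrete, here is a minimal plain-Python sketch of the line-81 expression, split into the same steps as the instructions in the table. The helper names (transpose, matmul, covariance_stepwise, covariance_direct) are made up for this illustration and are not SystemML APIs; the opcode labels in the comments are my reading of the heavy hitter list, not something confirmed against the compiled plan.

```python
def transpose(M):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Naive dense matrix multiply for list-of-rows matrices."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def covariance_stepwise(A):
    """C = (t(A) %*% A)/(N-1) - (N/(N-1)) * t(mu) %*% mu, one step at a time."""
    N = len(A)
    # mu: 1 x C row vector of column sums divided by N
    mu = [[sum(col) / N for col in zip(*A)]]
    AtA = matmul(transpose(A), A)        # t(A) %*% A    (assumed: tsmm, 81:5-81:16)
    mutmu = matmul(transpose(mu), mu)    # t(mu) %*% mu  (assumed: tsmm, 81:36-81:46)
    scaled = [[v / (N - 1) for v in row] for row in AtA]  # matrix / scalar
    factor = N / (N - 1)
    # scaled - factor * mutmu (assumed: the fused minus-multiply "-*")
    return [[s - factor * m for s, m in zip(rs, rm)]
            for rs, rm in zip(scaled, mutmu)]


def covariance_direct(A):
    """Reference: center the columns first, then t(Ac) %*% Ac / (N-1)."""
    N = len(A)
    means = [sum(col) / N for col in zip(*A)]
    centered = [[v - m for v, m in zip(row, means)] for row in A]
    CtC = matmul(transpose(centered), centered)
    return [[v / (N - 1) for v in row] for row in CtC]
```

Both functions should agree up to floating-point rounding, which is why the script can compute the covariance without materializing a centered copy of A.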

As an FYI, here are the steps I followed:
wget https://raw.githubusercontent.com/apache/systemml/master/scripts/algorithms/PCA.dml
wget https://raw.githubusercontent.com/apache/systemml/master/scripts/datagen/genRandData4PCA.dml
wget https://raw.githubusercontent.com/apache/systemml/master/conf/SystemML-config.xml.template
mv SystemML-config.xml.template SystemML-config.xml
# Set systemml.stats.finegrained to true
# Make sure you do a git pull to get the commit
# https://github.com/apache/systemml/commit/648eb21d66f9cd8727090cdf950986765a7e6ee8
~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --driver-memory 10g SystemML.jar -f genRandData4PCA.dml -nvargs R=10000 C=1000 F=binary OUT=pcaData.mtx
~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit --driver-memory 10g SystemML.jar -f PCA.dml -stats 30 -nvargs INPUT=pcaData.mtx OUTPUT=pca-1000x1000-model PROJDATA=1 CENTER=1 SCALE=1
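
The "Set systemml.stats.finegrained to true" step can be done by hand in an editor. If you prefer to script it, the following is a rough sketch using only the Python standard library. It assumes the config file is a flat XML document whose properties are direct children of the root element; that layout, and the function name enable_finegrained_stats, are assumptions for illustration, so check them against your copy of SystemML-config.xml.

```python
import xml.etree.ElementTree as ET

# Property name as given in the steps above.
PROPERTY = "systemml.stats.finegrained"


def enable_finegrained_stats(xml_text):
    """Return the config XML with PROPERTY set to "true".

    Assumes properties are direct children of the root element;
    the property is appended if it is not present.
    """
    root = ET.fromstring(xml_text)
    for child in root:
        if child.tag == PROPERTY:
            child.text = "true"
            break
    else:
        # Property not found in the file: add it.
        node = ET.SubElement(root, PROPERTY)
        node.text = "true"
    return ET.tostring(root, encoding="unicode")


def enable_in_file(path="SystemML-config.xml"):
    """Rewrite the config file in place (comments in the file are not preserved)."""
    with open(path, encoding="utf-8") as f:
        updated = enable_finegrained_stats(f.read())
    with open(path, "w", encoding="utf-8") as f:
        f.write(updated)
```

Note that ElementTree drops XML comments when it parses, so editing by hand is the better choice if you want to keep the template's comments.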

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar



From:	Janardhan Pulivarthi <janardhan.pulivarthi@gmail.com>
To:	dev@systemml.apache.org
Date:	07/21/2017 08:57 AM
Subject:	about performance statistics of PCA.dml



Hi Mike,

I'd like to know how expensive this critical code is:

 C = (t(A) %*% A)/(N-1) - (N/(N-1))*t(mu) %*% mu;

(at https://github.com/apache/systemml/blob/master/scripts/algorithms/PCA.dml#L81)
in the SPARK setting, given

   1. a 60Kx700 input for A
   2. a data size of 28 MB with 100 continuous variables and 1 column
   with a numeric label variable

with reference to this comment:
https://issues.apache.org/jira/browse/SYSTEMML-831?focusedCommentId=15525147&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15525147

Thank you,
Janardhan


