From: arijit chakraborty
To: dev@systemml.apache.org
Subject: Re: Decaying performance of SystemML
Date: Mon, 17 Jul 2017 18:22:00 +0000

Thanks Matthias for your answers! I'll look into the memory allocation issue; as you rightly pointed out, I may be making a mistake there. I'm more of a data scientist than a programmer, so I generally don't tinker with the existing setup and know less about these things. I'll read up on Spark configuration and try to fix this part.

Regarding "Recompile time", I didn't make any changes there. I created the Spark folder, installed SystemML via "pip", and also put the SystemML jar file in the Spark jars folder, so I'm not sure why recompilation is taking so long. That is the only tinkering I did with the SystemML setup. I'm also passing all Spark configuration through the Jupyter interface, since I don't want to accidentally make a mistake and it's also much easier. If you could point out where I might be going wrong, it would be of great help.

Thank you again for all your help!

Regards,
Arijit

________________________________
From: Matthias Boehm
Sent: Monday, July 17, 2017 2:09:47 AM
To: dev@systemml.apache.org
Subject: Re: Decaying performance of SystemML

Thanks for sharing the skeleton of the script. Here are a couple of suggestions:

1) Impact of fine-grained stats: The provided script executes mostly scalar instructions. In those kinds of scenarios, the time measurement per instruction can be a major performance bottleneck. I just executed this script with and without -stats and got end-to-end execution times of 132s and 54s respectively, which confirms this.

2) Memory budget: You allocate a vector of 100M values, i.e., 800MB - the fact that the stats output shows Spark instructions means that you're running the driver with very small memory (maybe the default of 1GB?). When comparing with R, please ensure that both have the same memory budget. On large data we would compile distributed operations, but of course you only benefit from that if you have a cluster - right now you're running in Spark local mode only.

3) Recompile time: Another thing that looks suspicious to me is the recompilation time of 15.529s for 4 recompilations. Typically, we see <1ms recompilation time per average DAG of 50-100 operators - could it be that there are some setup issues which lazily load classes and libraries?

Regards,
Matthias
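
[Editor's note: a minimal sketch of one way to raise the driver memory that point 2 above refers to, given the Jupyter/local-mode setup quoted below. The "8g" value is an illustrative assumption, not a recommendation from the thread; in local/client mode the driver heap must be fixed before the JVM starts, e.g. via PYSPARK_SUBMIT_ARGS or spark-submit --driver-memory, rather than on a SparkConf after the context already exists.]

    import os

    # Driver memory must be set before the JVM is launched; setting it on a
    # SparkConf after SparkContext creation has no effect in local/client mode.
    # "8g" is only an example value - size it to the data and the machine.
    os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 8g pyspark-shell"

    from pyspark import SparkContext
    sc = SparkContext("local[*]", "test")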
On Sun, Jul 16, 2017 at 8:31 AM, arijit chakraborty wrote:
> Hi Matthias,
>
> I was trying the following code in both R and SystemML. The difference in
> speed is huge, in computational terms.
>
> R time: 1.837146 mins
> SystemML time: Wall time: 4min 33s
>
> The code I'm working on is very similar to this code. The only difference
> is that I'm doing a lot more computation within these two while-loops.
>
> Can you help me understand why I'm getting this difference? My
> understanding was that with a larger data size, the SystemML performance
> should be far better than R's; on smaller data sizes their performances
> are almost the same.
>
> The code has been tested on the same system. The Spark configuration is
> the following.
>
> import os
> import sys
> import pandas as pd
> import numpy as np
>
> spark_path = "C:\spark"
> os.environ['SPARK_HOME'] = spark_path
> os.environ['HADOOP_HOME'] = spark_path
>
> sys.path.append(spark_path + "/bin")
> sys.path.append(spark_path + "/python")
> sys.path.append(spark_path + "/python/pyspark/")
> sys.path.append(spark_path + "/python/lib")
> sys.path.append(spark_path + "/python/lib/pyspark.zip")
> sys.path.append(spark_path + "/python/lib/py4j-0.10.4-src.zip")
>
> from pyspark import SparkContext
> from pyspark import SparkConf
>
> sc = SparkContext("local[*]", "test")
>
> # SystemML specifications:
>
> from pyspark.sql import SQLContext
> import systemml as sml
> sqlCtx = SQLContext(sc)
> ml = sml.MLContext(sc)
>
> The code we tested:
>
> a = matrix(seq(1, 100000000, 1), 1, 100000000)
> b = 2
>
> break_cond_1 = 0
> while(break_cond_1 == 0) {
>   break_cond_2 = 0
>   while(break_cond_2 == 0) {
>     # checking whether at least 10 of the data points are even
>     c = 0
>     for(i in 1:ncol(a)) {
>       if(i %% 2 == 0) {
>         c = c + 1
>       }
>     }
>     #c = c + 2
>     if(c > 1000) {
>       break_cond_2 = 1
>     } else {
>       c = c + 2
>     }
>   }
>
>   if(break_cond_2 == 1) {
>     break_cond_1 = 1
>   } else {
>     c = c + 2
>   }
> }
>
> Please find some more SystemML information below:
>
> SystemML Statistics:
> Total elapsed time:             0.000 sec.
> Total compilation time:         0.000 sec.
> Total execution time:           0.000 sec.
> Number of compiled Spark inst:  5.
> Number of executed Spark inst:  5.
> Cache hits (Mem, WB, FS, HDFS): 3/0/0/0.
> Cache writes (WB, FS, HDFS):    6/0/0.
> Cache times (ACQr/m, RLS, EXP): 0.000/0.001/0.004/0.000 sec.
> HOP DAGs recompiled (PRED, SB): 0/4.
> HOP DAGs recompile time:        15.529 sec.
> Spark ctx create time (lazy):   0.091 sec.
> Spark trans counts (par,bc,col):0/0/0.
> Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
> Total JIT compile time:         0.232 sec.
> Total JVM GC count:             5467.
> Total JVM GC time:              8.237 sec.
> Heavy hitter instructions (name, time, count):
> -- 1) %%            33.235 sec   100300000
> -- 2) rmvar         27.762 sec   250750035
> -- 3) ==            26.179 sec   100300017
> -- 4) +             15.555 sec   50150000
> -- 5) assignvar      6.611 sec   50150018
> -- 6) sp_seq         0.675 sec   1
> -- 7) sp_rshape      0.070 sec   1
> -- 8) sp_chkpoint    0.017 sec   3
> -- 9) seq            0.014 sec   3
> -- 10) rshape        0.003 sec   3
>
> Thank you!
>
> Arijit
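
[Editor's note: a minimal, hypothetical sketch of how the counting loop above could be expressed as a single vectorized expression, which is what Matthias's point about mostly scalar instructions (the %%, == and + heavy hitters) suggests. It reuses the sc and sml names from the quoted setup code; the use of sml.dml()/MLContext.execute() here is an assumption about the installed systemml Python API, not something shown in the thread.]

    import systemml as sml

    ml = sml.MLContext(sc)  # sc created as in the setup code quoted above

    # One aggregate over the whole vector instead of a 100M-iteration scalar
    # loop: (a %% 2) == 0 yields a 0/1 matrix, sum() counts the even entries.
    script = sml.dml("""
        a = matrix(seq(1, 100000000, 1), 1, 100000000)
        c = sum((a %% 2) == 0)
    """).output("c")

    c = ml.execute(script).get("c")
    print(c)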
> ________________________________
> From: arijit chakraborty
> Sent: Wednesday, July 12, 2017 12:21:43 AM
> To: dev@systemml.apache.org
> Subject: Re: Decaying performance of SystemML
>
> Thank you Matthias! I'll follow your suggestions. Regarding the TB
> setting, I was under the impression that "g" implies 512 MB; that's why I
> kept around 2TB of memory.
>
> Thanks again!
>
> Arijit
>
> ________________________________
> From: Matthias Boehm
> Sent: Tuesday, July 11, 2017 10:42:58 PM
> To: dev@systemml.apache.org
> Subject: Re: Decaying performance of SystemML
>
> Without any specifics of scripts or datasets, it's unfortunately hard, if
> not impossible, to help you here. However, note that the memory
> configuration seems wrong: why would you configure the driver and
> executors with 2TB if you only have 256GB per node? Maybe you are
> observing an issue of swapping. Also note that maxResultSize is
> irrelevant in case SystemML creates the Spark context, because we would
> set it to unlimited anyway.
>
> Regarding generally recommended configurations, it's usually a good idea
> to use one executor per worker node, with the number of cores set to the
> number of virtual cores. This allows maximum sharing of broadcasts across
> tasks and hence reduces memory pressure.
>
> Regards,
> Matthias
>
> On 7/11/2017 9:36 AM, arijit chakraborty wrote:
> > Hi,
> >
> > I'm creating a process using SystemML, but after a certain period of
> > time the performance decreases.
> >
> > 1) I get this warning message: WARN TaskSetManager: Stage 25254
> > contains a task of very large size (3954 KB). The maximum recommended
> > task size is 100 KB.
> >
> > 2) For Spark, we are using the following settings:
> >
> > spark.executor.memory 2048g
> > spark.driver.memory 2048g
> > spark.driver.maxResultSize 2048
> >
> > Is this good enough, or can we do something else to improve the
> > performance? We tried the Spark implementation suggested in the
> > documentation, but it didn't help much.
> >
> > 3) We are running on a system with 244 GB RAM, 32 cores, and 100 GB of
> > hard disk space.
> >
> > It would be great if anyone could guide me on how to improve the
> > performance.
> >
> > Thank you!
> >
> > Arijit
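
[Editor's note: for comparison with the 2048g values quoted above, a hedged sketch of settings sized to the machine actually described (244 GB RAM, 32 cores, single node). The concrete numbers are illustrative assumptions that leave headroom for the OS, not values recommended in the thread.]

    # Single node (244 GB, 32 cores) running in Spark local mode: only the
    # driver JVM exists, so give it most of the physical memory.
    spark.driver.memory    200g

    # If a real cluster were used instead, per Matthias's advice: one executor
    # per worker node, cores set to the number of virtual cores per node.
    spark.executor.memory  200g
    spark.executor.cores   32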