spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Apparent memory leak involving count
Date Thu, 09 Mar 2017 14:36:24 GMT
You seem to create a new RDD on every iteration instead of reusing the existing one, so it
does not seem surprising that the memory requirement keeps growing.
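
A minimal sketch of what reusing the RDD could look like, based on the quoted Leak.java. The class name ReuseRdd and the helper countRepeatedly are hypothetical names used only for illustration. Each count() is still an action that launches its own job, so this sketch alone does not establish that the heap stays bounded.

```java
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical variant of the quoted Leak.java: the RDD is built once
// and the same instance is counted on every iteration.
public class ReuseRdd {

  // Runs count() on the same RDD `times` times and returns the last result.
  static long countRepeatedly(JavaRDD<Double> rdd, int times) {
    long last = 0;
    for (int j = 0; j < times; j++) {
      last = rdd.count();
    }
    return last;
  }

  public static void main(String[] args) {
    // The master is expected to come from spark-submit, as in the original run.
    SparkConf conf = new SparkConf().setAppName("ReuseRdd");
    JavaSparkContext sc = new JavaSparkContext(conf);
    try {
      // Created once, outside the loop.
      JavaRDD<Double> rdd = sc.parallelize(Arrays.asList(1.0, 2.0, 3.0));
      System.out.println("count: " + countRepeatedly(rdd, 2999));
    } finally {
      sc.stop();
    }
  }
}
```

It can be submitted the same way as the original, for example with --master local[1] and --class ReuseRdd.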

> On 9 Mar 2017, at 15:24, Facundo Domínguez <facundominguez@gmail.com> wrote:
> 
> Hello,
> 
> Some heap profiling shows that memory grows under a TaskMetrics class.
> Thousands of live hashmap entries are accumulated.
> Would it be possible to disable collection of metrics? I've been
> looking for settings to disable it but nothing relevant seems to come
> up.
> 
> Thanks,
> Facundo
> 
> On Wed, Mar 8, 2017 at 2:02 PM, Facundo Domínguez
> <facundominguez@gmail.com> wrote:
>> Hello,
>> 
>> I'm running JavaRDD.count() repeatedly on a small RDD, and it seems to
>> increase the size of the Java heap over time until the default limit
>> is reached and an OutOfMemoryError is thrown. I'd expect this
>> program to run in constant space, and the problem carries over to some
>> more complicated tests I need to get working.
>> 
>> My Spark version is 2.1.0 and I'm running this using Nix on Debian jessie.
>> 
>> Is there anything basic that I could do to keep memory bounded?
>> 
>> I'm copying the program below and an example of the output.
>> 
>> Thanks in advance,
>> Facundo
>> 
>> /* Leak.java */
>> import java.util.*;
>> import java.nio.charset.StandardCharsets;
>> import java.nio.file.Files;
>> import java.nio.file.Paths;
>> import java.io.IOException;
>> import java.io.Serializable;
>> import org.apache.spark.api.java.*;
>> import org.apache.spark.SparkConf;
>> import org.apache.spark.api.java.function.*;
>> import org.apache.spark.sql.*;
>> 
>> public class Leak {
>> 
>>  public static void main(String[] args) throws IOException {
>> 
>>    SparkConf conf = new SparkConf().setAppName("Leak");
>>    JavaSparkContext sc = new JavaSparkContext(conf);
>>    SQLContext sqlc = new SQLContext(sc);
>> 
>>    for(int i=0;i<50;i++) {
>>      System.gc();
>>      long mem = Runtime.getRuntime().totalMemory();
>>      System.out.println("java total memory: " + mem);
>>      for(String s : Files.readAllLines(Paths.get("/proc/self/status"), StandardCharsets.UTF_8)) {
>>          if (0 <= s.indexOf("VmRSS"))
>>            System.out.println(s);
>>      }
>>      for(int j=0;j<2999;j++) {
>>        JavaRDD<Double> rdd = sc.parallelize(Arrays.asList(1.0,2.0,3.0));
>>        rdd.count();
>>      }
>>    }
>>    sc.stop();
>>  }
>> }
>> 
>> # example output
>> $ spark-submit --master local[1] --class Leak leak/build/libs/leak.jar
>> 17/03/08 11:26:37 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 17/03/08 11:26:37 WARN Utils: Your hostname, fd-tweag resolves to a loopback address: 127.0.0.1; using 192.168.1.42 instead (on interface wlan0)
>> 17/03/08 11:26:37 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
>> java total memory: 211288064
>> VmRSS:  200488 kB
>> java total memory: 456654848
>> VmRSS:  656472 kB
>> java total memory: 562036736
>> VmRSS:  677156 kB
>> java total memory: 562561024
>> VmRSS:  689424 kB
>> java total memory: 562561024
>> VmRSS:  701760 kB
>> java total memory: 562561024
>> VmRSS:  732540 kB
>> java total memory: 562561024
>> VmRSS:  748468 kB
>> java total memory: 562036736
>> VmRSS:  770680 kB
>> java total memory: 705691648
>> VmRSS:  789632 kB
>> java total memory: 706740224
>> VmRSS:  802720 kB
>> java total memory: 704118784
>> VmRSS:  832740 kB
>> java total memory: 705691648
>> VmRSS:  850808 kB
>> java total memory: 704118784
>> VmRSS:  875232 kB
>> java total memory: 705691648
>> VmRSS:  898716 kB
>> java total memory: 701497344
>> VmRSS:  919388 kB
>> java total memory: 905445376
>> VmRSS:  942628 kB
>> java total memory: 904921088
>> VmRSS:  989176 kB
>> java total memory: 901251072
>> VmRSS:  999540 kB
>> java total memory: 902823936
>> VmRSS: 1027212 kB
>> java total memory: 903348224
>> VmRSS: 1057668 kB
>> java total memory: 902299648
>> VmRSS: 1070976 kB
>> java total memory: 904396800
>> VmRSS: 1094640 kB
>> java total memory: 897056768
>> VmRSS: 1114612 kB
>> java total memory: 903872512
>> VmRSS: 1142324 kB
>> java total memory: 1050148864
>> VmRSS: 1147836 kB
>> java total memory: 1061158912
>> VmRSS: 1183668 kB
>> java total memory: 1052246016
>> VmRSS: 1211496 kB
>> java total memory: 1058013184
>> VmRSS: 1230696 kB
>> java total memory: 1059061760
>> VmRSS: 1259428 kB
>> java total memory: 1060634624
>> VmRSS: 1284252 kB
>> java total memory: 1055916032
>> VmRSS: 1319460 kB
>> java total memory: 1052246016
>> VmRSS: 1323044 kB
>> java total memory: 1052246016
>> VmRSS: 1323572 kB
>> java total memory: 1052246016
>> VmRSS: 1323836 kB
>> java total memory: 1052246016
>> VmRSS: 1323836 kB
>> java total memory: 1052246016
>> VmRSS: 1324096 kB
>> java total memory: 1052246016
>> VmRSS: 1324096 kB
>> java total memory: 1052246016
>> VmRSS: 1324096 kB
>> ...
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
> 
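
A note on the measurement in the quoted program: Runtime.totalMemory() reports the heap the JVM currently has reserved, not the part occupied by objects, so on its own it cannot separate a leak from ordinary heap expansion. A minimal sketch of reporting the occupied part instead, using only java.lang.Runtime. The class name HeapUsed and the helper usedHeapBytes are hypothetical names used only for illustration.

```java
// Hypothetical helper: reports occupied heap rather than reserved heap.
public class HeapUsed {

  // Reserved heap minus the free part of it, in bytes.
  static long usedHeapBytes() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
  }

  public static void main(String[] args) {
    System.gc(); // only a hint; the JVM may ignore it
    System.out.println("java used memory: " + usedHeapBytes());
    System.out.println("java max memory: " + Runtime.getRuntime().maxMemory());
  }
}
```

Calling usedHeapBytes() where the quoted program prints totalMemory() would show whether live data keeps growing between iterations.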
