spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Apparent memory leak involving count
Date Thu, 09 Mar 2017 14:46:55 GMT
The driver keeps metrics on everything that has executed; this is how it
can display job history in the UI. It's normal for that bookkeeping to
grow, because it records every job. You can configure it to retain
records for fewer jobs. That said, thousands of entries isn't especially
large.
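For example, the retention settings below bound how much job, stage, and
task status data the driver keeps (the property names are from the Spark
configuration reference; the values here are only illustrative):

  # spark-defaults.conf, or pass each as --conf key=value to spark-submit
  spark.ui.retainedJobs    100     # keep status entries for at most this many jobs (default 1000)
  spark.ui.retainedStages  100     # likewise for stages (default 1000)
  spark.ui.retainedTasks   10000   # likewise for tasks (default 100000)

Lowering these trades UI history for a smaller bookkeeping footprint in
the driver.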

On Thu, Mar 9, 2017 at 2:24 PM Facundo Domínguez <facundominguez@gmail.com>
wrote:

> Hello,
>
> Some heap profiling shows that memory grows under a TaskMetrics class.
> Thousands of live hashmap entries are accumulated.
> Would it be possible to disable collection of metrics? I've been
> looking for settings to disable it but nothing relevant seems to come
> up.
>
> Thanks,
> Facundo
>
> On Wed, Mar 8, 2017 at 2:02 PM, Facundo Domínguez
> <facundominguez@gmail.com> wrote:
> > Hello,
> >
> > I'm running JavaRDD.count() repeatedly on a small RDD, and it seems to
> > increase the size of the Java heap over time until the default limit
> > is reached and an OutOfMemoryError is thrown. I'd expect this
> > program to run in constant space, and the problem carries over to some
> > more complicated tests I need to get working.
> >
> > My Spark version is 2.1.0 and I'm running it using Nix on Debian Jessie.
> >
> > Is there anything simple I could do to keep memory bounded?
> >
> > I'm copying the program below and an example of the output.
> >
> > Thanks in advance,
> > Facundo
> >
> > /* Leak.java */
> > import java.util.*;
> > import java.nio.charset.StandardCharsets;
> > import java.nio.file.Files;
> > import java.nio.file.Paths;
> > import java.io.IOException;
> > import java.io.Serializable;
> > import org.apache.spark.api.java.*;
> > import org.apache.spark.SparkConf;
> > import org.apache.spark.api.java.function.*;
> > import org.apache.spark.sql.*;
> >
> > public class Leak {
> >
> >   public static void main(String[] args) throws IOException {
> >
> >     SparkConf conf = new SparkConf().setAppName("Leak");
> >     JavaSparkContext sc = new JavaSparkContext(conf);
> >     SQLContext sqlc = new SQLContext(sc);
> >
> >     for(int i=0;i<50;i++) {
> >       System.gc();
> >       long mem = Runtime.getRuntime().totalMemory();
> >       System.out.println("java total memory: " + mem);
> >       for(String s :
> > Files.readAllLines(Paths.get("/proc/self/status"),
> > StandardCharsets.UTF_8)) {
> >           if (0 <= s.indexOf("VmRSS"))
> >             System.out.println(s);
> >       }
> >       for(int j=0;j<2999;j++) {
> >         JavaRDD<Double> rdd = sc.parallelize(Arrays.asList(1.0,2.0,3.0));
> >         rdd.count();
> >       }
> >     }
> >     sc.stop();
> >   }
> > }
> >
> > # example output
> > $ spark-submit --master local[1] --class Leak leak/build/libs/leak.jar
> > 17/03/08 11:26:37 WARN NativeCodeLoader: Unable to load native-hadoop
> > library for your platform... using builtin-java classes where
> > applicable
> > 17/03/08 11:26:37 WARN Utils: Your hostname, fd-tweag resolves to a
> > loopback address: 127.0.0.1; using 192.168.1.42 instead (on interface
> > wlan0)
> > 17/03/08 11:26:37 WARN Utils: Set SPARK_LOCAL_IP if you need to bind
> > to another address
> > java total memory: 211288064
> > VmRSS:  200488 kB
> > java total memory: 456654848
> > VmRSS:  656472 kB
> > java total memory: 562036736
> > VmRSS:  677156 kB
> > java total memory: 562561024
> > VmRSS:  689424 kB
> > java total memory: 562561024
> > VmRSS:  701760 kB
> > java total memory: 562561024
> > VmRSS:  732540 kB
> > java total memory: 562561024
> > VmRSS:  748468 kB
> > java total memory: 562036736
> > VmRSS:  770680 kB
> > java total memory: 705691648
> > VmRSS:  789632 kB
> > java total memory: 706740224
> > VmRSS:  802720 kB
> > java total memory: 704118784
> > VmRSS:  832740 kB
> > java total memory: 705691648
> > VmRSS:  850808 kB
> > java total memory: 704118784
> > VmRSS:  875232 kB
> > java total memory: 705691648
> > VmRSS:  898716 kB
> > java total memory: 701497344
> > VmRSS:  919388 kB
> > java total memory: 905445376
> > VmRSS:  942628 kB
> > java total memory: 904921088
> > VmRSS:  989176 kB
> > java total memory: 901251072
> > VmRSS:  999540 kB
> > java total memory: 902823936
> > VmRSS: 1027212 kB
> > java total memory: 903348224
> > VmRSS: 1057668 kB
> > java total memory: 902299648
> > VmRSS: 1070976 kB
> > java total memory: 904396800
> > VmRSS: 1094640 kB
> > java total memory: 897056768
> > VmRSS: 1114612 kB
> > java total memory: 903872512
> > VmRSS: 1142324 kB
> > java total memory: 1050148864
> > VmRSS: 1147836 kB
> > java total memory: 1061158912
> > VmRSS: 1183668 kB
> > java total memory: 1052246016
> > VmRSS: 1211496 kB
> > java total memory: 1058013184
> > VmRSS: 1230696 kB
> > java total memory: 1059061760
> > VmRSS: 1259428 kB
> > java total memory: 1060634624
> > VmRSS: 1284252 kB
> > java total memory: 1055916032
> > VmRSS: 1319460 kB
> > java total memory: 1052246016
> > VmRSS: 1323044 kB
> > java total memory: 1052246016
> > VmRSS: 1323572 kB
> > java total memory: 1052246016
> > VmRSS: 1323836 kB
> > java total memory: 1052246016
> > VmRSS: 1323836 kB
> > java total memory: 1052246016
> > VmRSS: 1324096 kB
> > java total memory: 1052246016
> > VmRSS: 1324096 kB
> > java total memory: 1052246016
> > VmRSS: 1324096 kB
> > ...
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
>
