spark-user mailing list archives: August 2015

Umesh Kacha Re: How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? Thu, 20 Aug, 15:54
Umesh Kacha Re: How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? Fri, 21 Aug, 15:13
Umesh Kacha Re: How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? Wed, 26 Aug, 18:51
Umesh Kacha RE: How to avoid executor time out on yarn spark while dealing with large shuffle skewed data? Sun, 30 Aug, 09:34
Umesh Kacha Re: Spark executor OOM issue on YARN Mon, 31 Aug, 18:19
Upen N Topology.py -- Cannot run on Spark Gateway on Cloudera 5.4.4. Tue, 04 Aug, 02:10
Upen N Re: Topology.py -- Cannot run on Spark Gateway on Cloudera 5.4.4. Tue, 04 Aug, 15:46
Utkarsh Patkar Re: Performance - Python streaming v/s Scala streaming Tue, 25 Aug, 11:58
Utkarsh Sengar Exclude slf4j-log4j12 from the classpath via spark-submit Mon, 24 Aug, 21:50
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Mon, 24 Aug, 22:15
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Mon, 24 Aug, 22:58
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Mon, 24 Aug, 23:32
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Tue, 25 Aug, 00:05
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Tue, 25 Aug, 17:48
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Tue, 25 Aug, 20:50
Utkarsh Sengar Re: Exclude slf4j-log4j12 from the classpath via spark-submit Tue, 25 Aug, 21:30
Utkarsh Sengar Porting a multit-hreaded compute intensive job to spark Thu, 27 Aug, 20:32
VIJAYAKUMAR JAWAHARLAL Left outer joining big data set with small lookups Fri, 14 Aug, 13:39
VIJAYAKUMAR JAWAHARLAL Re: Left outer joining big data set with small lookups Mon, 17 Aug, 16:39
VIJAYAKUMAR JAWAHARLAL Re: Left outer joining big data set with small lookups Tue, 18 Aug, 17:30
VIJAYAKUMAR JAWAHARLAL COMPUTE STATS on hive table - NoSuchTableException Tue, 18 Aug, 18:19
VIJAYAKUMAR JAWAHARLAL What is the reason for ExecutorLostFailure? Tue, 18 Aug, 22:26
VIJAYAKUMAR JAWAHARLAL Re: What is the reason for ExecutorLostFailure? Wed, 19 Aug, 13:24
VIJAYAKUMAR JAWAHARLAL Data frame created from hive table and its partition Thu, 20 Aug, 14:29
VIJAYAKUMAR JAWAHARLAL Re: Data frame created from hive table and its partition Thu, 20 Aug, 19:47
Varadhan, Jawahar Re: Setting up Spark/flume/? to Ingest 10TB from FTP Fri, 14 Aug, 21:11
Varadhan, Jawahar Spark (1.2.0) submit fails with exception saying log directory already exists Tue, 25 Aug, 16:37
Vikram Kone Spark job workflow engine recommendations Fri, 07 Aug, 15:43
Vikram Kone Re: Spark job workflow engine recommendations Fri, 07 Aug, 16:01
Vikram Kone Re: Spark job workflow engine recommendations Fri, 07 Aug, 17:54
Vikram Kone Re: Spark job workflow engine recommendations Fri, 07 Aug, 22:55
Vikram Kone Re: Spark job workflow engine recommendations Tue, 11 Aug, 22:13
Vikram Kone How to run spark in standalone mode on cassandra with high availability? Sat, 15 Aug, 07:33
Virgil Palanciuc Finding the number of executors. Fri, 21 Aug, 14:42
Virgil Palanciuc Re: Finding the number of executors. Fri, 21 Aug, 20:51
Virgil Palanciuc Re: Finding the number of executors. Wed, 26 Aug, 14:19
Warfish Enum values in custom objects mess up RDD operations Thu, 06 Aug, 08:41
Wayne Song Exceptions in threads in executor code don't get caught properly Mon, 31 Aug, 19:52
William Briggs Re: Scala: How to match a java object???? Tue, 18 Aug, 19:46
William Briggs Re: How to automatically relaunch a Driver program after crashes? Wed, 19 Aug, 11:48
William Kinney Re: Setting a stage timeout Tue, 04 Aug, 11:39
Wu, James C. SparkSQL: remove jar added by "add jar " command from dependencies Fri, 07 Aug, 17:29
Wu, James C. SparkSQL: "add jar" blocks all queries Fri, 07 Aug, 19:40
Wu, James C. Re: SparkSQL: "add jar" blocks all queries Fri, 07 Aug, 20:58
Wyss Michael (wysm) Spark 1.3.0: ExecutorLostFailure depending on input file size Thu, 13 Aug, 17:33
Xiangrui Meng Re: Why transformer from ml.Pipeline transform only a DataFrame ? Fri, 28 Aug, 14:07
Xiao JIANG How to get total CPU consumption for Spark job Fri, 07 Aug, 22:06
Xiao JIANG RDD.join vs spark SQL join Thu, 13 Aug, 19:55
Xiao JIANG RE: RDD.join vs spark SQL join Sat, 15 Aug, 22:42
Xu (Simon) Chen Re: Using spark streaming to load data from Kafka to HDFS Sun, 23 Aug, 00:50
Yadid Ayzenberg spark 1.4.1 - LZFException Sat, 22 Aug, 19:57
Yakubovich, Alexey Unsupported major.minor version 51.0 Tue, 11 Aug, 14:55
Yana Kadiyska [SQL/Hive] Trouble with refreshTable Tue, 25 Aug, 16:51
Yana Kadiyska Re: How to unit test HiveContext without OutOfMemoryError (using sbt) Tue, 25 Aug, 20:12
Yanbo Liang Re: Extremely poor predictive performance with RF in mllib Tue, 04 Aug, 10:42
Yanbo Liang Re: TFIDF Transformation Tue, 04 Aug, 11:03
Yanbo Liang Re: Difference between RandomForestModel and RandomForestClassificationModel Wed, 05 Aug, 02:12
Yanbo Liang Re: Extremely poor predictive performance with RF in mllib Thu, 06 Aug, 06:26
Yanbo Liang Re: How to binarize data in spark Fri, 07 Aug, 05:36
Yanbo Liang Re: Convert mllib.linalg.Matrix to Breeze Thu, 20 Aug, 10:28
Yanbo Liang Re: Random Forest and StringIndexer in pyspark ML Pipeline Fri, 21 Aug, 10:35
Yanbo Liang Re: spark not launching in yarn-cluster mode Tue, 25 Aug, 08:56
Yann ROBIN Invalid environment variable name when submitting job from windows Tue, 25 Aug, 07:55
YaoPau Error when running pyspark/shell.py to set up iPython notebook Mon, 10 Aug, 04:38
YaoPau collect() works, take() returns "ImportError: No module named iter" Mon, 10 Aug, 21:53
YaoPau Spark 1.3 + Parquet: "Skipping data using statistics" Wed, 12 Aug, 22:11
YaoPau Re: collect() works, take() returns "ImportError: No module named iter" Wed, 12 Aug, 22:19
YaoPau Run Spark job from within iPython+Spark? Mon, 24 Aug, 20:06
YaoPau Pyspark ImportError: No module named definitions Tue, 25 Aug, 15:37
Yasemin Kaya Amazon DynamoDB & Spark Fri, 07 Aug, 13:08
Yasemin Kaya Re: Amazon DynamoDB & Spark Fri, 07 Aug, 18:22
Yasemin Kaya java.lang.ClassNotFoundException Sat, 08 Aug, 07:00
Yasemin Kaya Re: java.lang.ClassNotFoundException Sat, 08 Aug, 18:22
Yasemin Kaya EC2 cluster doesn't work saveAsTextFile Mon, 10 Aug, 12:08
Yasemin Kaya Re: EC2 cluster doesn't work saveAsTextFile Mon, 10 Aug, 12:58
Yiannis Gkoufas Re: Sorted Multiple Outputs Fri, 14 Aug, 09:35
Yin Huai Re: About Databricks's spark-sql-perf Thu, 13 Aug, 18:30
Yin Huai Re: SQLContext Create Table Problem Wed, 19 Aug, 15:59
Yin Huai Re: How to evaluate custom UDF over window Mon, 24 Aug, 16:09
Young, Matthew T Getting number of physical machines in Spark Thu, 27 Aug, 17:01
Zhan Zhang Re: Error when saving a dataframe as ORC file Sun, 23 Aug, 20:25
Ziqi Zhang automatically determine cluster number Fri, 07 Aug, 10:02
Zsombor Egyed How to connect to spark remotely from java Mon, 10 Aug, 11:44
Zsombor Egyed Re: How to connect to spark remotely from java Mon, 10 Aug, 12:26
Zsombor Egyed Spark MLLIB multiclass calssification Sun, 30 Aug, 04:23
Zsombor Egyed Re: Spark MLLIB multiclass calssification Sun, 30 Aug, 05:49
abellet Memory-efficient successive calls to repartition() Thu, 20 Aug, 09:26
abraithwaite Spark-submit fails when jar is in HDFS Fri, 07 Aug, 05:54
ai he Re: Sporadic "Input validation failed" error when executing LogisticRegressionWithLBFGS.train Tue, 11 Aug, 23:03
ai he Re: Re: Job aborted due to stage failure: java.lang.StringIndexOutOfBoundsException: String index out of range: 18 Sat, 29 Aug, 05:18
alexeyy3 Unsupported major.minor version 51.0 Tue, 11 Aug, 16:06
alexis GILLAIN MLlib Prefixspan implementation Thu, 20 Aug, 16:07
alexis GILLAIN Re: Memory-efficient successive calls to repartition() Mon, 24 Aug, 08:39
alexis GILLAIN Re: MLlib Prefixspan implementation Wed, 26 Aug, 07:10
allonsy Total delay per batch in a CSV file Tue, 04 Aug, 10:58
allonsy Kafka direct approach: blockInterval and topic partitions Mon, 10 Aug, 17:52
allonsy spark.streaming.maxRatePerPartition parameter: what are the benefits? Thu, 13 Aug, 14:50
ambujhbti Re: what determine the task size? Fri, 21 Aug, 02:01
andrew.row...@thomsonreuters.com Driver running out of memory - caused by many tasks? Thu, 27 Aug, 10:53
andrew.row...@thomsonreuters.com Re: Driver running out of memory - caused by many tasks? Thu, 27 Aug, 11:16