spark-user mailing list archives: June 2018

pchu [Pyspark mllib] RowMatrix.columnSimilarities losing spark context? Fri, 01 Jun, 01:42
Aakash Basu [Spark SQL] Efficiently calculating Weight of Evidence in PySpark Fri, 01 Jun, 08:14
Becket Qin [Spark SQL] Is it possible to do stream to stream inner join without event time? Fri, 01 Jun, 10:10
Swapnil Chougule Spark structured streaming generate output path runtime Fri, 01 Jun, 10:20
Lalwani, Jayesh Re: Spark structured streaming generate output path runtime Fri, 01 Jun, 13:39
Benjamin Kim Append In-Place to S3 Fri, 01 Jun, 16:00
Mohamed Nadjib MAMI Help explaining explain() after DataFrame join reordering Fri, 01 Jun, 16:31
Martin Peng How to work around NoOffsetForPartitionException when using Spark Streaming Fri, 01 Jun, 17:29
Jay Re: Append In-Place to S3 Sat, 02 Jun, 06:49
vincent gromakowski Re: Append In-Place to S3 Sat, 02 Jun, 07:14
Timur Shenkao Re: [Spark2.1] SparkStreaming to Cassandra performance problem Sat, 02 Jun, 11:35
Benjamin Kim Re: Append In-Place to S3 Sat, 02 Jun, 16:56
Aakash Basu Re: Append In-Place to S3 Sat, 02 Jun, 17:22
Pranav Agrawal [Spark SQL] error in performing dataset union with complex data type (struct, list) Sat, 02 Jun, 17:44
Pranav Agrawal [Spark SQL] error in performing dataset union with complex data type (struct, list) Sat, 02 Jun, 17:48
Alessandro Solimando Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Sun, 03 Jun, 14:35
Tayler Lawrence Jones Re: Append In-Place to S3 Sun, 03 Jun, 20:42
ayan guha Re: Append In-Place to S3 Sun, 03 Jun, 21:46
Tayler Lawrence Jones Re: Append In-Place to S3 Sun, 03 Jun, 21:57
Tayler Lawrence Jones Re: Append In-Place to S3 Sun, 03 Jun, 22:02
Shushant Arora Spark task default timeout Mon, 04 Jun, 03:58
Becket Qin Re: [Spark SQL] Is it possible to do stream to stream inner join without event time? Mon, 04 Jun, 04:13
Sing, Jasbir Sorting in Spark on multiple partitions Mon, 04 Jun, 04:47
Jörn Franke Re: Sorting in Spark on multiple partitions Mon, 04 Jun, 05:17
Pranav Agrawal Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Mon, 04 Jun, 08:17
Jorge Machado Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Mon, 04 Jun, 09:01
Pranav Agrawal Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Mon, 04 Jun, 09:09
Jorge Machado Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Mon, 04 Jun, 09:25
Spico Florin Re: testing frameworks Mon, 04 Jun, 11:14
Swapnil Chougule Re: Spark structured streaming generate output path runtime Mon, 04 Jun, 11:34
Pranav Agrawal Re: [Spark SQL] error in performing dataset union with complex data type (struct, list) Mon, 04 Jun, 12:04
kant kodali is there a way to create a static dataframe inside mapGroups? Mon, 04 Jun, 12:22
Jörn Franke Re: [External] Re: Sorting in Spark on multiple partitions Mon, 04 Jun, 17:08
Jörn Franke Re: [External] Re: Sorting in Spark on multiple partitions Mon, 04 Jun, 17:28
Jean Georges Perrin A code example of Catalyst optimization Mon, 04 Jun, 18:54
Shuporno Choudhury [PySpark] Releasing memory after a spark job is finished Mon, 04 Jun, 19:37
Jörn Franke Re: [PySpark] Releasing memory after a spark job is finished Mon, 04 Jun, 19:41
Shuporno Choudhury Re: [PySpark] Releasing memory after a spark job is finished Mon, 04 Jun, 20:02
Chetan Khatri Apply Core Java Transformation UDF on DataFrame Mon, 04 Jun, 20:11
Jörn Franke Re: [PySpark] Releasing memory after a spark job is finished Mon, 04 Jun, 20:18
Shuporno Choudhury Re: [PySpark] Releasing memory after a spark job is finished Mon, 04 Jun, 20:26
purna pradeep spark partitionBy with partitioned column in json output Mon, 04 Jun, 23:59
Lalwani, Jayesh Re: spark partitionBy with partitioned column in json output Tue, 05 Jun, 02:41
Jay Re: [PySpark] Releasing memory after a spark job is finished Tue, 05 Jun, 02:41
Jay Re: spark partitionBy with partitioned column in json output Tue, 05 Jun, 02:44
Thakrar, Jayesh Re: [PySpark] Releasing memory after a spark job is finished Tue, 05 Jun, 02:50
Jörn Franke Re: [PySpark] Releasing memory after a spark job is finished Tue, 05 Jun, 05:15
Shuporno Choudhury Re: [PySpark] Releasing memory after a spark job is finished Tue, 05 Jun, 05:38
Elior Malul Re: spark partitionBy with partitioned column in json output Tue, 05 Jun, 06:54
thomas lavocat [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 08:20
Matteo Cossu Re: Help explaining explain() after DataFrame join reordering Tue, 05 Jun, 08:38
kant kodali is there a way to parse and modify raw spark sql query? Tue, 05 Jun, 08:39
Saisai Shao Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 09:24
@Nan...@ Reg:- Py4JError in Windows 10 with Spark Tue, 05 Jun, 09:42
Rico Bergmann Strange codegen error for SortMergeJoin in Spark 2.2.1 Tue, 05 Jun, 10:58
thomas lavocat Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 11:17
Saisai Shao Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 11:44
thomas lavocat Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 11:48
Saisai Shao Re: [Spark Streaming] is spark.streaming.concurrentJobs a per node or a cluster global value ? Tue, 05 Jun, 11:52
Phillip Henry Using checkpoint much, much faster than cache. Why? Tue, 05 Jun, 14:06
alz2 Re: Writing custom Structured Streaming receiver Tue, 05 Jun, 15:55
raksja Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 18:39
Anastasios Zouzias Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 18:55
ravidspark Spark maxTaskFailures is not recognized with Cassandra Tue, 05 Jun, 19:19
Holden Karau Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:15
raksja Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:23
raksja Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:30
Nicolas Paris Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:37
raksja Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:40
Chetan Khatri Re: Apply Core Java Transformation UDF on DataFrame Tue, 05 Jun, 20:52
Nicolas Paris Re: Dataframe from 1.5G json (non JSONL) Tue, 05 Jun, 20:55
Aakash Basu [Spark Streaming] Distinct Count on unrelated columns Wed, 06 Jun, 11:02
Behroz Sikander [SparkLauncher] stateChanged event not received in standalone cluster mode Wed, 06 Jun, 12:18
Jay Re: Dataframe from 1.5G json (non JSONL) Wed, 06 Jun, 14:28
Jay Re: Reg:- Py4JError in Windows 10 with Spark Wed, 06 Jun, 14:32
Sing, Jasbir RE: [External] Re: Sorting in Spark on multiple partitions Wed, 06 Jun, 17:27
bsikander Re: [SparkLauncher] stateChanged event not received in standalone cluster mode Wed, 06 Jun, 17:53
Marcelo Vanzin Re: [SparkLauncher] stateChanged event not received in standalone cluster mode Wed, 06 Jun, 17:56
sha...@apache.org FINAL REMINDER: Apache EU Roadshow 2018 in Berlin next week! Wed, 06 Jun, 18:57
raksja Re: Dataframe from 1.5G json (non JSONL) Wed, 06 Jun, 21:23
spark receiver Re: Hive to Oracle using Spark - Type(Date) conversion issue Wed, 06 Jun, 21:48
Holden Karau Spark ML online serving Thu, 07 Jun, 00:10
licl Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint Thu, 07 Jun, 01:09
bis_g Pyspark Join and then column select is showing unexpected output Thu, 07 Jun, 01:58
李斌松 If there is timestamp type data in DF, Spark 2.3 toPandas is much slower than spark 2.2. Thu, 07 Jun, 04:22
amihay gonen Re: Apache Spark Structured Streaming - Kafka Streaming - Option to ignore checkpoint Thu, 07 Jun, 05:18
Kazuaki Ishizaki Re: Strange codegen error for SortMergeJoin in Spark 2.2.1 Thu, 07 Jun, 06:49
Luciano Resende [ANNOUNCE] Apache Bahir 2.1.2 Released Thu, 07 Jun, 08:53
Aakash Basu Fundamental Question on Spark's distribution Thu, 07 Jun, 09:53
杜斌 Register UDF duration runtime Thu, 07 Jun, 10:32
Benjamin Kim Re: Append In-Place to S3 Thu, 07 Jun, 14:56
Javier Pareja Long and consistent wait between tasks in streaming job Thu, 07 Jun, 16:44
Guillermo Ortiz Fernández Reset the offsets, Kafka 0.10 and Spark Thu, 07 Jun, 20:27
Javier Pareja Re: Long and consistent wait between tasks in streaming job Thu, 07 Jun, 21:59
Javier Pareja Re: Long and consistent wait between tasks in streaming job Thu, 07 Jun, 23:31
s...@draves.org [announce] BeakerX supports Scala+Spark in Jupyter Thu, 07 Jun, 23:33
Stephen Boesch Re: [announce] BeakerX supports Scala+Spark in Jupyter Fri, 08 Jun, 00:51
s...@draves.org Re: [announce] BeakerX supports Scala+Spark in Jupyter Fri, 08 Jun, 00:55
Irving Duran Re: If there is timestamp type data in DF, Spark 2.3 toPandas is much slower than spark 2.2. Fri, 08 Jun, 01:04
Kyunam Kim how to call database specific function when reading writing thru jdbc Fri, 08 Jun, 01:08