spark-user mailing list archives: November 2019

Gourav Sengupta Re: Delta with intelligent upsett Fri, 01 Nov, 06:52
Roland Johann Re: Delta with intelligent upsett Fri, 01 Nov, 06:57
Holden Karau Re: pyspark - memory leak leading to OOM after submitting 100 jobs? Fri, 01 Nov, 11:08
Patrick McCarthy Best practices for data like file storage Fri, 01 Nov, 15:33
grp XGBoost Spark One Model Per Worker Integration Fri, 01 Nov, 16:34
Burak Yavuz Re: Delta with intelligent upsett Sun, 03 Nov, 04:57
Sam Avro file question Mon, 04 Nov, 17:03
Yaniv Harpaz Re: Avro file question Mon, 04 Nov, 17:28
ayan guha Re: Avro file question Mon, 04 Nov, 19:22
Bryan Cutler [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Mon, 04 Nov, 22:28
zhangliyun A question about skew join hint Tue, 05 Nov, 01:21
sora How to use spark-on-k8s pod template? Tue, 05 Nov, 11:37
aka.fe2s static dataframe to streaming Tue, 05 Nov, 20:23
Mina Aslani 'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.' Tue, 05 Nov, 20:55
Mina Aslani Re: 'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.' Wed, 06 Nov, 04:13
Rishi Shah [pyspark 2.3.0] Task was denied committing errors Wed, 06 Nov, 12:30
Ashish Mittal Working failed to connect to master in Spark Apache Wed, 06 Nov, 13:44
Wenchen Fan Re: [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Wed, 06 Nov, 14:38
Jeff Evans What's the deal with --proxy-user? Wed, 06 Nov, 22:49
Klaus Ma Re: Build customized resource manager Thu, 07 Nov, 01:22
Rishi Shah Re: [pyspark 2.3.0] Task was denied committing errors Thu, 07 Nov, 01:27
V0lleyBallJunki3 Can reduced parallelism lead to no shuffle spill? Thu, 07 Nov, 16:14
Alexander Czech Re: Can reduced parallelism lead to no shuffle spill? Thu, 07 Nov, 18:36
abeboparebop Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Thu, 07 Nov, 18:37
Xingbo Jiang [ANNOUNCE] Announcing Apache Spark 3.0.0-preview Thu, 07 Nov, 22:53
Hyukjin Kwon Re: [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Fri, 08 Nov, 02:08
Shane Knapp Re: [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Fri, 08 Nov, 02:54
Takuya UESHIN Re: [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Fri, 08 Nov, 03:01
V0lleyBallJunki3 Re: Can reduced parallelism lead to no shuffle spill? Fri, 08 Nov, 03:11
Spico Florin Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Fri, 08 Nov, 09:01
Jacob Lynn Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Fri, 08 Nov, 09:54
Vadim Semenov Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Fri, 08 Nov, 13:23
Tom Graves Re: Build customized resource manager Fri, 08 Nov, 14:18
David Mitchell Re: How to use spark-on-k8s pod template? Fri, 08 Nov, 16:18
Jacob Lynn Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Fri, 08 Nov, 21:06
Bartosz Konieczny Why Spark generates Java code and not Scala? Sat, 09 Nov, 17:46
Marcin Tustin Re: Why Spark generates Java code and not Scala? Sat, 09 Nov, 18:36
Holden Karau Re: Why Spark generates Java code and not Scala? Sun, 10 Nov, 12:56
Gal Benshlomo RE: PySpark Pandas UDF Sun, 10 Nov, 15:31
Holden Karau Re: PySpark Pandas UDF Sun, 10 Nov, 15:53
Rishi Shah Re: [pyspark 2.3.0] Task was denied committing errors Sun, 10 Nov, 19:24
Nicolas Paris announce: spark-postgres 3 released Mon, 11 Nov, 00:02
Klaus Ma Re: Build customized resource manager Mon, 11 Nov, 02:24
Akshay Bhardwaj Re: spark streaming exception Mon, 11 Nov, 06:22
gal.benshlomo Re: PySpark Pandas UDF Mon, 11 Nov, 09:41
Tzahi File Using Percentile in Spark SQL Mon, 11 Nov, 14:45
Jerry Vinokurov Re: Using Percentile in Spark SQL Mon, 11 Nov, 15:13
Patrick McCarthy Re: Using Percentile in Spark SQL Mon, 11 Nov, 15:16
Muthu Jayakumar Re: Using Percentile in Spark SQL Mon, 11 Nov, 15:26
Marcin Tustin Re: Why Spark generates Java code and not Scala? Mon, 11 Nov, 15:27
Tzahi File Re: Using Percentile in Spark SQL Mon, 11 Nov, 15:33
Vadim Semenov Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Mon, 11 Nov, 15:42
Jerry Vinokurov Re: Using Percentile in Spark SQL Mon, 11 Nov, 15:55
Bin Fan Re: What is directory "/path/_spark_metadata" for? Mon, 11 Nov, 23:44
Chang Chen Is RDD thread safe? Tue, 12 Nov, 01:48
lk_spark how to limit tasks num when read hive with orc Tue, 12 Nov, 05:56
Jacob Lynn Re: Driver OutOfMemoryError in MapOutputTracker$.serializeMapStatuses for 40 TB shuffle. Tue, 12 Nov, 09:13
sora RE:How to use spark-on-k8s pod template? Tue, 12 Nov, 10:46
gal.benshlomo RE: PySpark Pandas UDF Tue, 12 Nov, 15:42
Holden Karau Re: PySpark Pandas UDF Tue, 12 Nov, 15:53
Laurent Bastien Corbeil Temporary tables for Spark SQL Tue, 12 Nov, 21:01
Bryan Cutler Re: [DISCUSS] Remove sorting of fields in PySpark SQL Row construction Wed, 13 Nov, 07:27
Anastasios Zouzias [Structured Streaming] Robust watermarking calculation with future timestamps Wed, 13 Nov, 09:57
asma zgolli error , saving dataframe , LEGACY_PASS_PARTITION_BY_AS_OPTIONS Wed, 13 Nov, 14:52
Femi Anthony Re: error , saving dataframe , LEGACY_PASS_PARTITION_BY_AS_OPTIONS Wed, 13 Nov, 15:05
asma zgolli Re: error , saving dataframe , LEGACY_PASS_PARTITION_BY_AS_OPTIONS Wed, 13 Nov, 15:17
Russell Spitzer Re: error , saving dataframe , LEGACY_PASS_PARTITION_BY_AS_OPTIONS Wed, 13 Nov, 15:19
anbutech Explode/Flatten Map type Data Using Pyspark Thu, 14 Nov, 17:50
ayan guha Re: Explode/Flatten Map type Data Using Pyspark Fri, 15 Nov, 03:16
anbutech Re: Explode/Flatten Map type Data Using Pyspark Fri, 15 Nov, 03:30
ayan guha Re: Explode/Flatten Map type Data Using Pyspark Fri, 15 Nov, 05:29
Sivaprasanna Is there a merge API available for writing DataFrame Fri, 15 Nov, 08:47
ayan guha Re: Is there a merge API available for writing DataFrame Fri, 15 Nov, 11:11
Bryan Cutler Re: PySpark Pandas UDF Mon, 18 Nov, 07:08
Gourav Sengupta Re: PySpark Pandas UDF Mon, 18 Nov, 07:16
Bryan Jeffrey Structured Streaming & Enrichment Broadcasts Mon, 18 Nov, 14:20
Wim Van Leuven Performance of PySpark 2.3.2 on Microsoft Windows Mon, 18 Nov, 15:54
Alfredo Marquez SparkR integration with Hive 3 spark-r Mon, 18 Nov, 17:23
Nicolas Paris Re: SparkR integration with Hive 3 spark-r Mon, 18 Nov, 18:53
Alfredo Marquez Re: SparkR integration with Hive 3 spark-r Mon, 18 Nov, 21:00
Burak Yavuz Re: Structured Streaming & Enrichment Broadcasts Tue, 19 Nov, 04:22
Sonal Goyal Re: Is RDD thread safe? Tue, 19 Nov, 13:46
bsikander Spark 2.4.4 with Hadoop 3.2.0 Tue, 19 Nov, 14:24
Alfredo Marquez Re: Spark 2.4.4 with Hadoop 3.2.0 Tue, 19 Nov, 14:43
Punya Maremalla I am testing on Spark 3.0 preview release Tue, 19 Nov, 15:26
Roland Johann Structured Streaming Kafka change maxOffsetsPerTrigger won't apply Wed, 20 Nov, 08:33
Gabor Somogyi Re: Structured Streaming Kafka change maxOffsetsPerTrigger won't apply Wed, 20 Nov, 11:23
Jiang, Yi J (CWM-NR) Spark onApplicationEnd run multiple times during the application failure Wed, 20 Nov, 22:05
Valerie Hayot [PySpark] Understanding the times reported by PythonRunner Wed, 20 Nov, 23:10
hemant singh Re: Spark onApplicationEnd run multiple times during the application failure Thu, 21 Nov, 08:12
Jiang, Yi J (CWM-NR) RE: Spark onApplicationEnd run multiple times during the application failure Thu, 21 Nov, 13:54
hemant singh Re: Spark onApplicationEnd run multiple times during the application failure Thu, 21 Nov, 13:56
Marcelo Valle join with just 1 record causes all data to go to a single node Thu, 21 Nov, 15:51
Alfredo Marquez Re: SparkR integration with Hive 3 spark-r Sat, 23 Nov, 00:26
Aniruddha P Tekade Can spark convert String to Integer when reading using schema in structured streaming Sat, 23 Nov, 01:17
Rishi Shah [pyspark 2.4] maxrecordsperfile option Sun, 24 Nov, 04:36
Felix Cheung Re: SparkR integration with Hive 3 spark-r Sun, 24 Nov, 20:20
Chang Chen Re: Is RDD thread safe? Mon, 25 Nov, 03:10
lk_spark how spark structrued stream write to kudu Mon, 25 Nov, 08:00
lk_spark Re: how spark structrued stream write to kudu Mon, 25 Nov, 08:24