spark-user mailing list archives

From puneetloya <>
Subject Spark 2.4.3 - Structured Streaming - high on Storage Memory
Date Sun, 16 Jun 2019 05:08:16 GMT

Just upgraded Spark from 2.2.3 to 2.4.3. 

Ran a load test with a week's worth of messages in Kafka. We are seeing odd
behavior: why is the storage memory so high? We have run similar workloads with
Spark 2.2.3 and have never seen such behavior. Has something pretty basic about
Spark changed?

Our main changes for 2.4.3:
1) We started using the Cassandra sink supported in Spark 2.4.
2) Moved from Hadoop 2.7.3 to Hadoop 3.1.1, mainly because we use S3
checkpointing and the AWS SDK for Hadoop 2.7.3 does not have a fix for
connection retries to S3 storage.
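
For context on item 2, the S3A connector in Hadoop 3.1 exposes retry settings that can be passed through Spark with the `spark.hadoop.` prefix. Below is a minimal sketch, assuming the standard `fs.s3a.*` keys. The values are illustrative placeholders, not the settings from this load test, and `to_submit_args` is a hypothetical helper:

```python
# Illustrative S3A retry settings (placeholder values, not from the load test).
# Keys assume the Hadoop 3.1 S3A connector, passed through Spark via "spark.hadoop.".
S3A_RETRY_CONF = {
    "spark.hadoop.fs.s3a.attempts.maximum": "20",
    "spark.hadoop.fs.s3a.retry.limit": "7",
    "spark.hadoop.fs.s3a.retry.interval": "500ms",
}


def to_submit_args(conf):
    """Render a conf dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the arguments on one line so they can be pasted after `spark-submit`.
    print(" ".join(to_submit_args(S3A_RETRY_CONF)))
```

The same keys could instead go in `spark-defaults.conf`. None of them is expected to affect storage memory, so they only document the Hadoop side of the upgrade.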

