spark-user mailing list archives

From Jan Holmberg <jan.holmb...@perigeum.fi>
Subject Re: Stress testing hdfs with Spark
Date Tue, 05 Apr 2016 19:56:44 GMT
I'm trying to get a rough estimate of how much data I can write within a certain time period (GB/s).
-jan
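A minimal sketch of one way to measure this: time a saveAsTextFile of a known number of bytes and divide by the elapsed seconds. The partition count, record size, and output path are all hypothetical, and the data is generated lazily per partition (see the sketch after the quoted question below).

import org.apache.spark.{SparkConf, SparkContext}

object HdfsWriteThroughput {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hdfs-write-throughput"))

    val partitions = 16                  // hypothetical: tune to cluster size
    val recordsPerPartition = 1000000    // ~1 GB per partition at ~1 KB per record
    val record = "x" * 1024              // 1 KB payload per line
    // saveAsTextFile appends a newline to each record, hence the +1
    val totalBytes = partitions.toLong * recordsPerPartition * (record.length + 1)

    val start = System.nanoTime()
    sc.parallelize(0 until partitions, partitions)
      .mapPartitions(_ => Iterator.fill(recordsPerPartition)(record))
      .saveAsTextFile("hdfs:///tmp/hdfs-stress")   // hypothetical output path
    val secs = (System.nanoTime() - start) / 1e9

    println(f"${totalBytes / 1e9}%.2f GB in $secs%.1f s = ${totalBytes / (1e9 * secs)}%.2f GB/s")
    sc.stop()
  }
}

The elapsed time includes job scheduling overhead, so longer runs give a more stable GB/s figure.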

On 05 Apr 2016, at 22:49, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:

Hi Jan,

What is the definition of a stress test here? What are the metrics? Throughput of data,
latency, velocity, volume?

HTH


Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 5 April 2016 at 20:42, Jan Holmberg <jan.holmberg@perigeum.fi> wrote:
Hi,
I'm trying to figure out how to write lots of data from each worker. I tried rdd.saveAsTextFile
but got an OOM when generating a 1024 MB string on a worker. Increasing worker memory would mean
that I'd have to drop the number of workers.
So, any idea how to write e.g. a 1 GB file from each worker?

cheers,
-jan
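One common way around that OOM (a minimal sketch, assuming a Spark 1.x SparkContext and a hypothetical output path): build each partition's data lazily with mapPartitions and an Iterator, so only one 1 KB line is on the heap at a time instead of a single 1 GB string.

import org.apache.spark.{SparkConf, SparkContext}

object OneGbPerWorker {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("one-gb-per-worker"))

    val numWorkers = 8                   // hypothetical: one partition per worker
    val line = "x" * 1024                // 1 KB per line
    val linesPerWorker = (1L << 30).toInt / line.length   // ~1 GB per partition

    // One partition per worker; the iterator produces lines on demand,
    // so no 1 GB string is ever materialized at once.
    sc.parallelize(0 until numWorkers, numWorkers)
      .mapPartitions(_ => Iterator.fill(linesPerWorker)(line))
      .saveAsTextFile("hdfs:///tmp/one-gb-per-worker")    // hypothetical path

    sc.stop()
  }
}

Note that every line is the same 1 KB string, which is fine for raw write throughput but would skew results if output compression were enabled.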
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


