Not sure if this will help you.

1. Create one application that puts files into your S3 bucket from a public data source (you can use public wiki-data).
2. Create another application (a Spark Streaming one) that listens on that bucket and performs some operation (caching, groupBy, etc.) as soon as the data arrives.
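The second step above can be sketched with Spark Streaming's `textFileStream`, which monitors a directory (including an S3 path) and processes files as they appear. This is a minimal sketch, not a complete setup: the bucket path `s3n://my-bucket/wiki-data/` is a placeholder, and you would still need AWS credentials configured and a batch interval tuned to your data rate.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object S3StreamingWorkload {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("S3StreamingWorkload")
    // 10-second batches; adjust to match the rate the first app writes files
    val ssc = new StreamingContext(conf, Seconds(10))

    // Picks up each new file the producer application drops into the bucket
    // (s3n://my-bucket/wiki-data/ is a hypothetical path)
    val lines = ssc.textFileStream("s3n://my-bucket/wiki-data/")

    // A memory/shuffle-heavy operation: word count via reduceByKey
    val counts = lines
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    counts.cache()  // keep batch results in memory to add memory pressure
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Reading from S3 stresses the network on ingest, while `reduceByKey` plus `cache()` stresses shuffle and executor memory, which is the point of this workload.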

In this way you can stress both the network and memory.

Best Regards

On Mon, Jun 30, 2014 at 12:25 AM, danilopds <danilobits@gmail.com> wrote:
I'm studying the Spark platform and I'd like to run experiments with its
Spark Streaming extension.

I guess that memory- and network-intensive workloads are good options.
Can anyone suggest a few typical Spark Streaming workloads that are
network/memory intensive?

Any other suggestions for good workloads on Spark Streaming would be
interesting too.


View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Memory-Network-Intensive-Workload-tp8501.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.