spark-user mailing list archives

From Akhil Das <>
Subject Re: Memory/Network Intensive Workload
Date Mon, 30 Jun 2014 06:30:21 GMT

Not sure if this will help you, but here is one idea.

1. Create one application that uploads files to your S3 bucket from a public
data source (you can use public wiki data).
2. Create a second application (a Spark Streaming one) that watches that
bucket and performs some operation (caching, groupBy, etc.) as soon as
new data arrives.
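
A rough sketch of the second application, in PySpark, might look like the
following. It is only an illustration under some assumptions: the records are
assumed to be tab-separated "key<TAB>text" lines, the helper names parse_line
and summarize are made up for this example, and the bucket path is whatever
you pass on the command line. textFileStream watches a directory on a
Hadoop-compatible filesystem and only picks up files that appear after the
job starts.

```python
import sys


def parse_line(line):
    """Split an assumed 'key<TAB>text' record into (key, length of text)."""
    key, _, text = line.partition("\t")
    return key, len(text)


def summarize(pair):
    """Turn (key, iterable of sizes) into (key, record count, total size)."""
    key, sizes = pair
    sizes = list(sizes)
    return key, len(sizes), sum(sizes)


def main(bucket_path, batch_seconds=10):
    # PySpark is imported here so the helpers above can be used without it.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="S3StreamingWorkload")
    ssc = StreamingContext(sc, batch_seconds)

    # New files landing under bucket_path become the next batch.
    lines = ssc.textFileStream(bucket_path)

    # cache() keeps each batch in memory; groupByKey() forces a shuffle,
    # which is where the network traffic between executors comes from.
    pairs = lines.map(parse_line).cache()
    grouped = pairs.groupByKey().map(summarize)
    grouped.pprint()

    ssc.start()
    ssc.awaitTermination()


if __name__ == "__main__" and len(sys.argv) > 1:
    # The bucket path is supplied by the caller; none is assumed here.
    main(sys.argv[1])
```

Submitted with spark-submit and the bucket path as its only argument, this
prints a per-key record count and total size for every batch. A shorter batch
interval or a larger upload rate from the first application should raise the
load.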

This way the workload keeps both the network and memory busy.

Best Regards

On Mon, Jun 30, 2014 at 12:25 AM, danilopds <> wrote:

> Hello,
> I'm studying the Spark platform and I'd like to run experiments with its
> Spark Streaming extension.
> I think a memory- and network-intensive workload would be a good option.
> Can anyone suggest a few typical Spark Streaming workloads that are
> network/memory intensive?
> Suggestions for other good Spark Streaming workloads would be
> welcome too.
> Thanks!
