spark-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: Spark enables us to process Big Data on an ARM cluster !!
Date Wed, 19 Mar 2014 14:43:15 GMT
I don't know anything about ARM clusters, but it looks great. What are
the specs? Do the nodes have no local disk at all?


On Tue, Mar 18, 2014 at 10:36 PM, Chanwit Kaewkasi <chanwit@gmail.com> wrote:

> Hi all,
>
> We are a small team doing research on low-power (and low-cost) ARM
> clusters. We built a 20-node ARM cluster that is able to start Hadoop.
> But as you all know, Hadoop performs on-disk operations,
> so it's not suitable for a constrained machine powered by ARM.
>
> We then switched to Spark and had to say wow!!
>
> Spark / HDFS enables us to crush Wikipedia articles (from 2012),
> 34 GB in size, in 1h50m. We have identified the bottleneck: it's our
> 100 Mbps network.
>
> Here's the cluster:
> https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/Mk-I_SSD.png
>
> And this is what we got from Spark's shell:
> https://dl.dropboxusercontent.com/u/381580/aiyara_cluster/result_00.png
>
> I think it's the first ARM cluster that can process a non-trivial amount
> of Big Data.
> (Please correct me if I'm wrong.)
> I really want to thank the Spark team for making this possible!!
>
> Best regards,
>
> -chanwit
>
> --
> Chanwit Kaewkasi
> linkedin.com/in/chanwit
>
