spark-user mailing list archives

From ShreyanshB <shreyanshpbh...@gmail.com>
Subject Re: Graphx : Perfomance comparison over cluster
Date Wed, 23 Jul 2014 18:04:07 GMT
Thanks Ankur.

> The version with in-memory shuffle is here:
> https://github.com/amplab/graphx2/commits/vldb. Unfortunately Spark has
> changed a lot since then, and the way to configure and invoke Spark is
> different. I can send you the correct configuration/invocation for this if
> you're interested in benchmarking it.

It would be great if you could tell me how to configure and invoke this
Spark version.



On Sun, Jul 20, 2014 at 9:02 PM, ankurdave [via Apache Spark User List] <
ml-node+s1001560n10281h89@n3.nabble.com> wrote:

> On Fri, Jul 18, 2014 at 9:07 PM, ShreyanshB wrote:
>
>> Does the suggested version with in-memory shuffle affect performance
>> that much?
>
>
> We've observed a 2-3x speedup from it, at least on larger graphs like
> twitter-2010 <http://law.di.unimi.it/webdata/twitter-2010/> and uk-2007-05
> <http://law.di.unimi.it/webdata/uk-2007-05/>.
>
>> (According to previously reported numbers, GraphX did 10 iterations in 142
>> seconds and in the latest stats it does it in 68 seconds.) Is it just the
>> in-memory version which is changed?
>
>
> If you're referring to previous results vs. the arXiv paper, there were
> several improvements, but in-memory shuffle had the largest impact.
>
> Ankur <http://www.ankurdave.com/>
>
>




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Graphx-Perfomance-comparison-over-cluster-tp10222p10523.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.