spark-user mailing list archives

From oppokui <oppo...@gmail.com>
Subject Re: Support R in Spark
Date Sat, 06 Sep 2014 17:44:57 GMT
Thanks, Christopher. I saw it before; it is amazing. Last time I tried to download it from Adatao,
but got no response after filling in the form. How can I download it or its source code? What is
the license?

Kui


> On Sep 6, 2014, at 8:08 PM, Christopher Nguyen <ctn@adatao.com> wrote:
> 
> Hi Kui,
> 
> DDF (open sourced) also aims to do something similar, adding RDBMS idioms, and is already
implemented on top of Spark.
> 
> One philosophy is that the DDF API aggressively hides the notion of parallel datasets,
exposing only (mutable) tables to users, on which they can apply R and other familiar data
mining/machine learning idioms, without having to know about the distributed representation
underneath. Now, you can get to the underlying RDDs if you want to, simply by asking for it.
> 
> This was launched at the July Spark Summit. See http://spark-summit.org/2014/talk/distributed-dataframe-ddf-on-apache-spark-simplifying-big-data-for-the-rest-of-us
> 
> Sent while mobile. Please excuse typos etc.
> 
> On Sep 4, 2014 1:59 PM, "Shivaram Venkataraman" <shivaram@eecs.berkeley.edu> wrote:
> Thanks Kui. SparkR is a pretty young project, but there are a bunch of
> things we are working on. One of the main features is to expose a data
> frame API (https://sparkr.atlassian.net/browse/SPARKR-1) and we will
> be integrating this with Spark's MLlib. At a high level this will
> allow R users to use a familiar API but make use of MLlib's efficient
> distributed implementation. This is the same strategy used in Python
> as well.
> 
> Also we do hope to merge SparkR with mainline Spark -- we have a few
> features to complete before that and plan to shoot for integration by
> Spark 1.3.
> 
> Thanks
> Shivaram
> 
> On Wed, Sep 3, 2014 at 9:24 PM, oppokui <oppokui@gmail.com> wrote:
> > Thanks, Shivaram.
> >
> > No specific use case yet. We are trying to use R in our project because our
> > data scientists all know R. Our concern is how R handles massive data. Spark
> > does better in the big data area, and Spark ML focuses on predictive
> > analysis. So we are wondering whether we can combine R and Spark.
> > We tried SparkR and it is pretty easy to use, but we haven't seen
> > any feedback on this package from industry. It would be better if the Spark
> > team supported R just like Scala/Java/Python.
> >
> > Another question: if MLlib re-implements all the well-known data mining
> > algorithms in Spark, then what is the purpose of using R?
> >
> > Another option for us is H2O, which supports R natively. H2O is more
> > friendly to data scientists. I saw that H2O can also work on Spark (Sparkling
> > Water). Is it better than using SparkR?
> >
> > Thanks and Regards.
> >
> > Kui
> >
> >
> > On Sep 4, 2014, at 1:47 AM, Shivaram Venkataraman
> > <shivaram@eecs.berkeley.edu> wrote:
> >
> > Hi
> >
> > Do you have a specific use-case where SparkR doesn't work well? We'd love
> > to hear more about use-cases and features that can be improved with SparkR.
> >
> > Thanks
> > Shivaram
> >
> >
> > On Wed, Sep 3, 2014 at 3:19 AM, oppokui <oppokui@gmail.com> wrote:
> >>
> >> Does the Spark ML team have a plan to support R scripts natively? There is a
> >> SparkR project, but it is not from the Spark team. Spark ML uses netlib-java to
> >> talk to native Fortran routines, or uses NumPy, so why not use R in some
> >> way?
> >>
> >> R has a lot of useful packages. If the Spark ML team can include R support, it
> >> will be very powerful.
> >>
> >> Any comment?
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> >> For additional commands, e-mail: user-help@spark.apache.org
> >>
> >
> >
> 
> 

