spark-dev mailing list archives

From Bert Greevenbosch <Bert.Greevenbo...@huawei.com>
Subject RE: Artificial Neural Network in Spark?
Date Thu, 03 Jul 2014 07:04:23 GMT
Hi Debasish, all,

Thanks for your feedback. I have submitted the code to GitHub and created a Jira ticket (links
below).

The ANN uses back-propagation with the Steepest Gradient Descent (SGD) method.

Best regards,
Bert

https://github.com/apache/spark/pull/1290
https://issues.apache.org/jira/browse/SPARK-2352
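
For readers unfamiliar with the approach, back-propagation with plain gradient descent for a network with a single hidden layer can be sketched as follows. This is an illustrative sketch in plain Python, not the Scala code from the pull request. The sigmoid activation, the squared-error loss, the bias handling, and all function names are assumptions made for the example.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(rows: int, cols: int, rng: random.Random) -> Matrix:
    # Small random weights; each row gets one extra column for the bias.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols + 1)] for _ in range(rows)]


def layer(weights: Matrix, inputs: List[float]) -> List[float]:
    # Append a constant 1.0 so the last weight of each row acts as a bias.
    extended = inputs + [1.0]
    return [sigmoid(sum(w * v for w, v in zip(row, extended))) for row in weights]


def forward(w_hidden: Matrix, w_out: Matrix, x: List[float]) -> Tuple[List[float], List[float]]:
    hidden = layer(w_hidden, x)
    return hidden, layer(w_out, hidden)


def loss(w_hidden: Matrix, w_out: Matrix, x: List[float], target: List[float]) -> float:
    # Assumed loss: half the squared error summed over all outputs.
    _, out = forward(w_hidden, w_out, x)
    return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))


def train_step(w_hidden: Matrix, w_out: Matrix, x: List[float],
               target: List[float], step: float) -> Tuple[Matrix, Matrix]:
    hidden, out = forward(w_hidden, w_out, x)
    # Error signal at the outputs: dLoss/dNet for sigmoid units.
    delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
    # Propagate the error back through the output weights to the hidden layer.
    delta_hidden = [
        h * (1.0 - h) * sum(d * row[j] for d, row in zip(delta_out, w_out))
        for j, h in enumerate(hidden)
    ]
    # Gradient descent: move each weight against its gradient.
    new_out = [[w - step * d * v for w, v in zip(row, hidden + [1.0])]
               for row, d in zip(w_out, delta_out)]
    new_hidden = [[w - step * d * v for w, v in zip(row, x + [1.0])]
                  for row, d in zip(w_hidden, delta_hidden)]
    return new_hidden, new_out


def demo() -> Tuple[float, float]:
    # 2 inputs -> 3 hidden nodes -> 2 outputs, trained on one made-up sample.
    rng = random.Random(0)
    w_h = init_weights(3, 2, rng)
    w_o = init_weights(2, 3, rng)
    x, t = [0.5, -1.0], [0.2, 0.9]
    before = loss(w_h, w_o, x, t)
    for _ in range(200):
        w_h, w_o = train_step(w_h, w_o, x, t, step=0.5)
    return before, loss(w_h, w_o, x, t)
```

Calling `demo()` returns the loss before and after training on the single sample; the second value should be the smaller one.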


 
> -----Original Message-----
> From: Debasish Das [mailto:debasish.das83@gmail.com]
> Sent: 01 July 2014 12:21
> To: dev@spark.apache.org
> Subject: Re: Artificial Neural Network in Spark?
> 
> I will let Xiangrui comment on the PR process for adding the code to
> MLlib, but I would love to look at your initial version if you push it
> to GitHub.
> 
> As far as I remember, Quoc got his best ANN results using the
> back-propagation algorithm solved with CG. Do you have those features,
> or are you using an SGD-style update?
> 
> 
> 
> On Mon, Jun 30, 2014 at 8:13 PM, Bert Greevenbosch
> <Bert.Greevenbosch@huawei.com> wrote:
> 
> > Hi Debasish, Alexander, all,
> >
> > Indeed I found the OpenDL project through the Powered by Spark page.
> > I'll need some time to look into the code, but at first sight it
> > looks quite well developed. I'll contact the author about this too.
> >
> > My own implementation (in Scala) works for multiple inputs and
> > multiple outputs. It implements a single hidden layer whose number
> > of nodes can be specified.
> >
> > The implementation is a general ANN implementation. As such, it
> > should be usable for an autoencoder too, since that is just an ANN
> > with some special input/output constraints.
> >
> > As said before, the implementation is built upon the linear
> > regression model and gradient descent implementation. However, it
> > did require some tweaks:
> >
> > - The linear regression model only supports a single output "label"
> > (as Double). Since the ANN can have multiple outputs, it ignores the
> > "label" attribute, but for training divides the input vector into
> > two parts, the first part being the genuine input vector, the second
> > the target output vector.
> >
> > - The concatenation of input and target output vectors is internal
> > only; the training function takes as input an RDD of tuples of two
> > Vectors, one for the input and one for the output.
> >
> > - The GradientDescent optimizer is re-used without modification.
> >
> > - I have made an even simpler updater than the SimpleUpdater, leaving
> > out the division by the square root of the number of iterations. The
> > SimpleUpdater can also be used, but I created this simpler one
> > because I like to plot the result every now and then, and then
> > continue the calculations. For this, I also wrote a training function
> > that takes the weights from the previous training session as input.
> >
> > - I created a ParallelANNModel similar to the LinearRegressionModel.
> >
> > - I created a new GeneralizedSteepestDescendAlgorithm class similar
> > to the GeneralizedLinearAlgorithm class.
> >
> > - I created some example code to test with 2D (1 input, 1 output), 3D
> > (2 inputs, 1 output) and 4D (1 input, 3 outputs) functions.
> >
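Two of the tweaks listed above can be illustrated with a small sketch: the internal concatenation of input and target vectors, and an updater with a constant step size next to one that divides the step by the square root of the iteration number. This is plain Python for illustration only, not the MLlib or pull-request code. All function names here are hypothetical, and the quadratic test function in the usage note is an assumption chosen to keep the example short.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Updater = Callable[[Vector, Vector, float, int], Vector]


def concat_pair(inputs: Vector, targets: Vector) -> Vector:
    # Pack an (input, target) tuple into the single vector used internally.
    return list(inputs) + list(targets)


def split_pair(combined: Vector, n_inputs: int) -> Tuple[Vector, Vector]:
    # Recover the genuine input vector and the target output vector.
    if not 0 <= n_inputs <= len(combined):
        raise ValueError("n_inputs out of range")
    return combined[:n_inputs], combined[n_inputs:]


def decaying_update(weights: Vector, gradient: Vector,
                    step_size: float, iteration: int) -> Vector:
    # Step size divided by sqrt(iteration), as the thread describes
    # for SimpleUpdater.
    this_step = step_size / math.sqrt(iteration)
    return [w - this_step * g for w, g in zip(weights, gradient)]


def constant_update(weights: Vector, gradient: Vector,
                    step_size: float, iteration: int) -> Vector:
    # The "even simpler" variant: the iteration number is ignored.
    return [w - step_size * g for w, g in zip(weights, gradient)]


def run(update: Updater, gradient_of: Callable[[Vector], Vector],
        weights: Vector, step_size: float, iterations: int) -> Vector:
    # One training session; iteration counting restarts at 1 every session.
    for i in range(1, iterations + 1):
        weights = update(weights, gradient_of(weights), step_size, i)
    return weights
```

As a usage example, take the gradient of (w - 3)^2, `grad = lambda w: [2.0 * (w[0] - 3.0)]`. With `constant_update`, one session of 10 iterations from `[0.0]` ends at the same weights as two consecutive sessions of 5 iterations where the second starts from the weights of the first. With `decaying_update` the two differ, because the step size jumps back up when the iteration count restarts. That is one reason a constant step is convenient when training is paused for plotting and then resumed.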
> > If there is interest, I would be happy to release the code. What
> > would be the best way to do this? Is there some kind of review
> > process?
> >
> > Best regards,
> > Bert
> >
> >
> > > -----Original Message-----
> > > From: Debasish Das [mailto:debasish.das83@gmail.com]
> > > Sent: 27 June 2014 14:02
> > > To: dev@spark.apache.org
> > > Subject: Re: Artificial Neural Network in Spark?
> > >
> > > Look into the Powered by Spark page. I found a project there which
> > > used autoencoder functions. It has not been updated for a long time
> > > now!
> > >
> > > On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander
> > > <alexander.ulanov@hp.com> wrote:
> > >
> > > > Hi Bert,
> > > >
> > > > It would be extremely interesting. Do you plan to implement an
> > > > autoencoder as well? It would be great to have deep learning in
> > > > Spark.
> > > >
> > > > Best regards, Alexander
> > > >
> > > > On 27.06.2014, at 4:47, "Bert Greevenbosch"
> > > > <Bert.Greevenbosch@huawei.com> wrote:
> > > >
> > > > > Hello all,
> > > > >
> > > > > I was wondering whether Spark/mllib supports Artificial Neural
> > > > > Networks (ANNs)?
> > > > >
> > > > > If not, I am currently working on an implementation of it. I
> > > > > re-use the code for linear regression and gradient descent as
> > > > > much as possible.
> > > > >
> > > > > Would the community be interested in such an implementation? Or
> > > > > maybe somebody is already working on it?
> > > > >
> > > > > Best regards,
> > > > > Bert
> > > >
> >