Hi Debasish, Alexander, all,
Indeed, I found the OpenDL project through the Powered by Spark page. I'll need some time to
look into the code, but at first sight it looks quite well developed. I'll contact the
author about this too.
My own implementation (in Scala) works for multiple inputs and multiple outputs. It implements
a single hidden layer, whose number of nodes can be specified.
The implementation is a general ANN implementation. As such, it should be useable for an autoencoder
too, since that is just an ANN with some special input/output constraints.
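
To illustrate the autoencoder remark: the special constraint is essentially that the target output equals the input. A minimal sketch in plain Scala, assuming training data is given as (input, target) pairs; the object and function names are hypothetical and not part of the implementation described here:

```scala
// Illustrative sketch only; AutoencoderPairs and toTrainingPairs are hypothetical names.
object AutoencoderPairs {
  // Turn each input vector into an (input, target) pair whose target is a copy
  // of the input, which is the constraint that makes a general ANN an autoencoder.
  def toTrainingPairs(inputs: Seq[Array[Double]]): Seq[(Array[Double], Array[Double])] =
    inputs.map(x => (x, x.clone()))
}
```

With Spark, the same mapping would presumably be applied to an RDD of Vectors rather than a Seq of arrays.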
As mentioned before, the implementation builds on the existing linear regression model and
gradient descent implementation. However, it did require some tweaks:
- The linear regression model only supports a single output "label" (as a Double). Since the
  ANN can have multiple outputs, it ignores the "label" attribute; for training, it instead divides
  the input vector into two parts: the first part is the genuine input vector, the second
  the target output vector.
- The concatenation of input and target output vectors is internal only: the training function
  takes as input an RDD of tuples of two Vectors, one for the input and one for the output.
- The GradientDescent optimizer is reused without modification.
- I have made an even simpler updater than the SimpleUpdater, leaving out the division by
  the square root of the iteration number. The SimpleUpdater can also be used, but I created
  this simpler one because I like to plot the result every now and then, and then continue the
  calculations. For this, I also wrote a training function that takes the weights from the
  previous training session as input.
- I created a ParallelANNModel similar to the LinearRegressionModel.
- I created a new GeneralizedSteepestDescendAlgorithm class similar to the GeneralizedLinearAlgorithm
  class.
- I created some example code to test with 2D (1 input, 1 output), 3D (2 inputs, 1 output) and
  4D (1 input, 3 outputs) functions.
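
Two of the tweaks above, the internal input/target concatenation and the simpler updater, can be sketched in plain Scala as follows. This is only an illustration using arrays instead of Spark Vectors; the object and function names are hypothetical, and the plain gradient step `w - step * gradient` is an assumption about the update rule:

```scala
// Illustrative sketch only; all names here are hypothetical.
object AnnTweaksSketch {
  // Join input and target output into one combined vector for training.
  def concat(input: Array[Double], target: Array[Double]): Array[Double] =
    input ++ target

  // Recover (input, target) from the combined vector, given the number of inputs.
  def split(combined: Array[Double], numInputs: Int): (Array[Double], Array[Double]) =
    combined.splitAt(numInputs)

  // Step size with the division by the square root of the iteration number,
  // as described for the SimpleUpdater.
  def decayingStep(stepSize: Double, iter: Int): Double =
    stepSize / math.sqrt(iter.toDouble)

  // Step size of the simpler updater: constant, independent of the iteration number.
  def constantStep(stepSize: Double, iter: Int): Double = stepSize

  // One plain gradient step (assumed update rule): w - step * gradient.
  def update(weights: Array[Double], gradient: Array[Double], step: Double): Array[Double] =
    weights.zip(gradient).map { case (w, g) => w - step * g }
}
```

With a constant step, stopping after some iterations to plot and then resuming from the saved weights does not change the step size, whereas a step that depends on the iteration number would restart from its initial value if the counter is reset.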
If there is interest, I would be happy to release the code. What would be the best way to
do this? Is there some kind of review process?
Best regards,
Bert
> Original Message
> From: Debasish Das [mailto:debasish.das83@gmail.com]
> Sent: 27 June 2014 14:02
> To: dev@spark.apache.org
> Subject: Re: Artificial Neural Network in Spark?
>
> Look into Powered by Spark page...I found a project there which used
> autoencoder functions...It's not updated for a long time now !
>
> On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander
> <alexander.ulanov@hp.com
> > wrote:
>
> > Hi Bert,
> >
> > It would be extremely interesting. Do you plan to implement
> autoencoder as
> > well? It would be great to have deep learning in Spark.
> >
> > Best regards, Alexander
> >
> > On 27.06.2014, at 4:47, "Bert Greevenbosch" <Bert.Greevenbosch@huawei.com>
> > wrote:
> >
> > > Hello all,
> > >
> > > I was wondering whether Spark/mllib supports Artificial Neural
> Networks
> > (ANNs)?
> > >
> > > If not, I am currently working on an implementation of it. I reuse
> the
> > code for linear regression and gradient descent as much as possible.
> > >
> > > Would the community be interested in such implementation? Or maybe
> > somebody is already working on it?
> > >
> > > Best regards,
> > > Bert
> >
