Hi Debasish, Alexander, all,
Indeed, I found the OpenDL project through the Powered by Spark page. I'll need some time to look into the code, but at first sight it looks quite well developed. I'll contact the author about this too.
My own implementation (in Scala) works for multiple inputs and multiple outputs. It implements a single hidden layer, whose number of nodes can be specified.
The implementation is a general ANN implementation. As such, it should be useable for an autoencoder too, since that is just an ANN with some special input/output constraints.
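For illustration only (this is not the actual implementation), a forward pass through a network with one hidden layer and multiple inputs and outputs could be sketched in plain Scala as below. The object name AnnSketch, the sigmoid activation, the bias terms and the weight layout are all assumptions made for this sketch. The last function shows the autoencoder constraint: the target output is simply the input itself.

```scala
object AnnSketch {
  // Logistic activation. The choice of sigmoid is an assumption.
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  // One fully connected layer. Each row of `weights` belongs to one node:
  // index 0 is the bias, followed by one weight per input (hypothetical layout).
  def layer(weights: Array[Array[Double]], in: Array[Double]): Array[Double] =
    weights.map { w =>
      val sum = w(0) + in.indices.map(i => w(i + 1) * in(i)).sum
      sigmoid(sum)
    }

  // Forward pass: input -> single hidden layer -> output layer.
  // The number of rows in `hidden` is the configurable number of hidden nodes.
  def forward(hidden: Array[Array[Double]],
              output: Array[Array[Double]],
              in: Array[Double]): Array[Double] =
    layer(output, layer(hidden, in))

  // Autoencoder training pair: the target output equals the input.
  def autoencoderPair(in: Array[Double]): (Array[Double], Array[Double]) =
    (in, in.clone())
}
```

With 2 inputs, 3 hidden nodes and 2 outputs, `hidden` would have 3 rows of length 3 and `output` 2 rows of length 4.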
As said before, the implementation builds upon the linear regression model and the gradient descent implementation. However, it did require some tweaks:
- The linear regression model only supports a single output "label" (as a Double). Since the ANN can have multiple outputs, it ignores the "label" attribute and instead, for training, divides the input vector into two parts: the first part is the genuine input vector, the second the target output vector.
- The concatenation of input and target output vectors happens only internally; the training function takes as input an RDD of tuples of two Vectors, one for the input and one for the output.
- The GradientDescent optimizer is re-used without modification.
- I have made an even simpler updater than the SimpleUpdater, leaving out the division by the square root of the number of iterations. The SimpleUpdater can also be used, but I created this simpler one because I like to plot the result every now and then and then continue the calculations. For this, I also wrote a training function that takes the weights from the previous training session as input.
- I created a ParallelANNModel similar to the LinearRegressionModel.
- I created a new GeneralizedSteepestDescendAlgorithm class similar to the GeneralizedLinearAlgorithm class.
- I created some example code to test with 2D (1 input, 1 output), 3D (2 inputs, 1 output) and 4D (1 input, 3 outputs) functions.
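Two of the tweaks above can be sketched in plain Scala as follows. This is only an illustration under stated assumptions, not the actual code: it uses Array[Double] instead of MLlib Vectors and RDDs, and the object and function names (TrainingSketch, pack, unpack, decayingStep, constantStep, update) are hypothetical. The first part shows the internal concatenation of input and target output; the second contrasts a step size divided by the square root of the iteration number with the constant step size of the simpler updater.

```scala
object TrainingSketch {
  // Pack an (input, target) pair into one vector, input part first.
  def pack(input: Array[Double], target: Array[Double]): Array[Double] =
    input ++ target

  // Recover the pair; numInputs is the length of the genuine input part.
  def unpack(packed: Array[Double], numInputs: Int): (Array[Double], Array[Double]) =
    packed.splitAt(numInputs)

  // Step size divided by sqrt(iter), as described for the SimpleUpdater.
  def decayingStep(stepSize: Double, iter: Int): Double =
    stepSize / math.sqrt(iter.toDouble)

  // Step size of the simpler updater: the same in every iteration.
  def constantStep(stepSize: Double, iter: Int): Double = stepSize

  // One gradient descent step: w - step * gradient, element by element.
  def update(weights: Array[Double],
             gradient: Array[Double],
             step: Double): Array[Double] =
    weights.zip(gradient).map { case (w, g) => w - step * g }
}
```

One plausible reading of the design choice: with a constant step size, stopping after some iterations and resuming from the saved weights takes the same steps as one uninterrupted run, whereas a step size that depends on the iteration number would restart at its largest value in each new session.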
If there is interest, I would be happy to release the code. What would be the best way to do this? Is there some kind of review process?
Best regards,
Bert
> -----Original Message-----
> From: Debasish Das [mailto:debasish.das83@gmail.com]
> Sent: 27 June 2014 14:02
> To: dev@spark.apache.org
> Subject: Re: Artificial Neural Network in Spark?
>
> Look into Powered by Spark page...I found a project there which used
> autoencoder functions...It's not updated for a long time now !
>
> On Thu, Jun 26, 2014 at 10:51 PM, Ulanov, Alexander wrote:
>
> > Hi Bert,
> >
> > It would be extremely interesting. Do you plan to implement autoencoder as
> > well? It would be great to have deep learning in Spark.
> >
> > Best regards, Alexander
> >
> > On 27.06.2014, at 4:47, "Bert Greevenbosch" wrote:
> >
> > > Hello all,
> > >
> > > I was wondering whether Spark/mllib supports Artificial Neural Networks
> > > (ANNs)?
> > >
> > > If not, I am currently working on an implementation of it. I re-use the
> > > code for linear regression and gradient descent as much as possible.
> > >
> > > Would the community be interested in such implementation? Or maybe
> > > somebody is already working on it?
> > >
> > > Best regards,
> > > Bert
> >