spark-user mailing list archives

From Jaonary Rabarisoa <jaon...@gmail.com>
Subject Re: DeepLearning and Spark ?
Date Sat, 10 Jan 2015 08:10:55 GMT
Can someone explain the difference between a parameter server and
Spark?

There's already an issue on this topic
https://issues.apache.org/jira/browse/SPARK-4590


Another example of DL in Spark, essentially based on Downpour SGD:
http://deepdist.com
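
To make the parameter-server idea concrete, here is a minimal single-machine sketch in the spirit of Downpour SGD. It is not how deepdist or SPARK-4590 does it: plain Python threads stand in for Spark workers, the model is a toy linear regression, and every name (ParameterServer, worker, train) is hypothetical. The point is only the pattern: workers pull the current weights, compute a gradient on their own data shard, and push it back without waiting for each other.

```python
import random
import threading


class ParameterServer:
    """Holds the shared weights; workers pull copies and push gradients."""

    def __init__(self, dim, lr):
        self._weights = [0.0] * dim
        self._lr = lr
        self._lock = threading.Lock()

    def pull(self):
        with self._lock:
            return list(self._weights)

    def push(self, grad):
        # Apply a gradient step as soon as it arrives (no global barrier).
        with self._lock:
            for i, g in enumerate(grad):
                self._weights[i] -= self._lr * g


def worker(server, shard, steps, batch_size, seed):
    rng = random.Random(seed)
    for _ in range(steps):
        w = server.pull()  # may already be stale when the gradient is pushed
        batch = [rng.choice(shard) for _ in range(batch_size)]
        grad = [0.0] * len(w)
        for x, y in batch:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for i, xi in enumerate(x):
                grad[i] += 2.0 * err * xi / batch_size
        server.push(grad)


def train(n_workers=4, steps=300, seed=0):
    rng = random.Random(seed)
    true_w = [2.0, -3.0, 0.5]  # toy target weights (assumption)
    data = []
    for _ in range(1200):
        x = [rng.gauss(0.0, 1.0) for _ in true_w]
        y = sum(wi * xi for wi, xi in zip(true_w, x))
        data.append((x, y))
    # Data parallelism: each worker only sees its own shard.
    shards = [data[i::n_workers] for i in range(n_workers)]
    server = ParameterServer(dim=len(true_w), lr=0.05)
    threads = [
        threading.Thread(target=worker, args=(server, shard, steps, 16, seed + 1 + i))
        for i, shard in enumerate(shards)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull(), true_w


if __name__ == "__main__":
    learned, target = train()
    print("learned:", learned)
    print("target: ", target)
```

The contrast with plain Spark is that the weights here are mutable shared state updated asynchronously, whereas a broadcast variable is read-only and only refreshed between stages.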


On Sat, Jan 10, 2015 at 2:27 AM, Peng Cheng <rhwing@gmail.com> wrote:

> Not if broadcast can only be used between stages. To enable this you have
> to at least make broadcast asynchronous & non-blocking.
>
> On 9 January 2015 at 18:02, Krishna Sankar <ksankar42@gmail.com> wrote:
>
>> I am also looking at this domain. We could potentially use the broadcast
>> capability in Spark to distribute the parameters. Haven't thought it through yet.
>> Cheers
>> <k/>
>>
>> On Fri, Jan 9, 2015 at 2:56 PM, Andrei <faithlessfriend@gmail.com> wrote:
>>
>>> Does it make sense to use Spark's actor system (e.g. via
>>> SparkContext.env.actorSystem) to create a parameter server?
>>>
>>> On Fri, Jan 9, 2015 at 10:09 PM, Peng Cheng <rhwing@gmail.com> wrote:
>>>
>>>> You are not the first :) and probably not the fifth to ask this question.
>>>> A parameter server is not included in the Spark framework, and I've seen all
>>>> kinds of hacks to improvise one: REST API, HDFS, Tachyon, etc.
>>>> Not sure if an 'official' benchmark & implementation will be released
>>>> soon.
>>>>
>>>> On 9 January 2015 at 10:59, Marco Shaw <marco.shaw@gmail.com> wrote:
>>>>
>>>>> Pretty vague on details:
>>>>>
>>>>>
>>>>> http://www.datasciencecentral.com/m/blogpost?id=6448529%3ABlogPost%3A227199
>>>>>
>>>>>
>>>>> On Jan 9, 2015, at 11:39 AM, Jaonary Rabarisoa <jaonary@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Deep learning algorithms are popular and achieve state-of-the-art
>>>>> performance on several real-world machine learning problems. Currently
>>>>> there is no DL implementation in Spark, and I wonder if there is any
>>>>> ongoing work on this topic.
>>>>>
>>>>> We can do DL in Spark with Sparkling Water and H2O, but this adds an
>>>>> additional software stack.
>>>>>
>>>>> Deeplearning4j seems to implement a distributed version of many
>>>>> popular DL algorithms. Porting DL4J to Spark could be interesting.
>>>>>
>>>>> Google describes an implementation of large-scale DL in this paper:
>>>>> http://research.google.com/archive/large_deep_networks_nips2012.html
>>>>> It is based on model parallelism and data parallelism.
>>>>>
>>>>> So, I'm trying to imagine what a good design for DL algorithms in Spark
>>>>> would be. Spark already has RDDs (for data parallelism). Can GraphX be
>>>>> used for model parallelism (as DNNs are generally designed as DAGs)? And
>>>>> what about using GPUs for local parallelism (a mechanism to push
>>>>> partitions into GPU memory)?
>>>>>
>>>>>
>>>>> What do you think about this?
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Jao
>>>>>
>>>>>
>>>>
>>>
>>
>
