Can someone explain what the difference is between a parameter server and Spark?

There's already an issue on this topic

Another example of DL in Spark, essentially based on Downpour SGD

Not if broadcast can only be used between stages. To enable this you have to at least make broadcast asynchronous & non-blocking.

I am also looking at this domain. We could potentially use the broadcast capability in Spark to distribute the parameters. I haven't thought it through yet.
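To make the broadcast idea concrete, here is a rough sketch of the synchronous pattern it would give: ship the current parameters to every partition, compute one gradient per partition, combine them on the driver, update, repeat. This is plain Python, not Spark code. The inner lists stand in for RDD partitions, and the names `partition_gradient` and `train_sync`, the toy 1-D model and the data are all made up for illustration.

```python
# Sketch only: plain Python stands in for Spark here. Each inner list plays
# the role of one RDD partition, and passing `w` into the per-partition
# function plays the role of a broadcast variable re-created every iteration.

def partition_gradient(w, partition):
    """Summed squared-error gradient for the 1-D model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in partition)

def train_sync(partitions, w=0.0, lr=0.01, iterations=200):
    n = sum(len(p) for p in partitions)
    for _ in range(iterations):
        # "Broadcast" w and compute one gradient per partition (a map stage) ...
        grads = [partition_gradient(w, p) for p in partitions]
        # ... then combine on the driver (a reduce) and update the parameter.
        w -= lr * sum(grads) / n
    return w

# Toy data generated from y = 3x, split across two hypothetical partitions.
partitions = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_final = train_sync(partitions)  # should end up close to 3.0
```

Every iteration waits for all partitions before the next set of parameters goes out, so the update is fully synchronous.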

Does it make sense to use Spark's actor system (e.g. via SparkContext.env.actorSystem) to create a parameter server?

You are not the first :) and probably not the fifth to ask this question.
A parameter server is not included in the Spark framework, and I've seen all kinds of hacks to improvise one: REST API, HDFS, Tachyon, etc.
Not sure if an 'official' benchmark & implementation will be released soon.
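For anyone unsure what such a hack has to provide, here is a minimal in-process sketch of the pull/push contract, with two threads playing the workers. It is an assumption-laden toy, not how any of the hacks above work: the `ParameterServer` class, the `worker` and `run_async` functions, the 1-D model and the data are all hypothetical. A real setup would put the server behind a network API rather than a lock.

```python
import threading

class ParameterServer:
    """Hypothetical in-process stand-in for a parameter server."""

    def __init__(self, w):
        self._w = w
        self._lock = threading.Lock()

    def pull(self):
        # Workers fetch the current parameter whenever they like.
        with self._lock:
            return self._w

    def push(self, grad, lr):
        # Gradients are applied as they arrive; nobody waits for anyone else.
        with self._lock:
            self._w -= lr * grad

def worker(server, shard, lr, steps):
    for _ in range(steps):
        w = server.pull()  # may already be stale by the time we push
        grad = sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)
        server.push(grad, lr)

def run_async(shards, lr=0.01, steps=500):
    server = ParameterServer(0.0)
    threads = [threading.Thread(target=worker, args=(server, s, lr, steps))
               for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.pull()

# Toy data generated from y = 3x, one shard per worker.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
w_async = run_async(shards)  # should end up close to 3.0
```

The point of the sketch is that workers never synchronise with each other, only with the server, which is what the stage-based execution model does not give you out of the box.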

Pretty vague on details:

Deep learning algorithms are popular and achieve state-of-the-art performance on several real-world machine learning problems. Currently there is no DL implementation in Spark, and I wonder if there is ongoing work on this topic.

We can do DL in Spark with Sparkling Water and H2O, but this adds an additional software stack.

Deeplearning4j seems to implement a distributed version of many popular DL algorithms. Porting DL4J to Spark could be interesting.

Google describes an implementation of large-scale DL in this paper, based on model parallelism and data parallelism.

So, I'm trying to imagine what a good design for DL algorithms in Spark would be. Spark already has RDDs (for data parallelism). Can GraphX be used for model parallelism (as DNNs are generally designed as DAGs)? And what about using GPUs for local parallelism (a mechanism to push partitions into GPU memory)?

What do you think about this?