spark-user mailing list archives

From Jason Dai <jason....@gmail.com>
Subject Re: Deep Learning with Spark, what is your experience?
Date Mon, 06 May 2019 01:34:47 GMT
Hi Riccardo,

Yes, you can run TensorFlow distributed training (and inference) inline
with PySpark; see some examples at
https://github.com/intel-analytics/analytics-zoo/blob/master/pyzoo/zoo/examples/tensorflow/tfpark/estimator_dataset.py
(using the TF Keras API),
https://github.com/intel-analytics/analytics-zoo/blob/master/pyzoo/zoo/examples/tensorflow/tfpark/estimator_dataset.py
(using the TF Estimator API), and
https://github.com/intel-analytics/analytics-zoo/tree/master/pyzoo/zoo/examples/tensorflow/distributed_training.

Regarding Keras API support in Analytics Zoo: it is a new implementation of
the Keras 1.2.2 API on Spark (using BigDL).
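For intuition on what this kind of training does under the hood: BigDL runs
synchronous data-parallel training, where each Spark task computes gradients
on its own data partition and the gradients are aggregated before every
parameter update. The toy below is only a pure-Python sketch of that pattern
(no Spark or TensorFlow required; a one-parameter linear model stands in for
the network, and the function names are mine, not BigDL's):

```python
# Toy illustration of synchronous data-parallel SGD: each "worker"
# (standing in for a Spark task) computes the gradient on its own data
# partition; the gradients are averaged (an all-reduce) before each update,
# so every replica applies the same step and stays in sync.
# A one-parameter least-squares model y = w * x stands in for the network.

def partition_gradient(w, partition):
    """Mean gradient of 0.5*(w*x - y)^2 over one data partition."""
    return sum((w * x - y) * x for x, y in partition) / len(partition)

def train(partitions, w=0.0, lr=0.01, steps=100):
    for _ in range(steps):
        # Each worker computes a local gradient on its partition...
        grads = [partition_gradient(w, p) for p in partitions]
        # ...then the gradients are averaged and the shared parameter is
        # updated once per step, exactly as if trained on all the data.
        w -= lr * sum(grads) / len(grads)
    return w

# Data generated from y = 3*x, split round-robin across 4 "workers".
data = [(x, 3.0 * x) for x in range(1, 9)]
partitions = [data[i::4] for i in range(4)]
w = train(partitions)
print(round(w, 3))  # converges to the true slope, 3.0
```

The real frameworks differ in how the aggregation is implemented (parameter
servers, ring all-reduce, Spark block manager), but the data-parallel shape
is the same.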

Thanks,
-Jason

On Mon, May 6, 2019 at 5:37 AM Riccardo Ferrari <ferrarir@gmail.com> wrote:

> Thanks everyone, I really appreciate your contributions here.
>
> @Jason, thanks for the references I'll take a look. Quickly checking
> github:
> https://github.com/intel-analytics/analytics-zoo#distributed-tensorflow-and-keras-on-sparkbigdl
> Do I understand correctly that I can:
>
>    - Prepare my data with Spark
>    - Define a TensorFlow model
>    - Train it in a distributed fashion
>
> When using the Keras API, is it the real Keras with just an adapter
> layer, or is it a completely different API that mimics Keras?
>
> @Gourav, I agree that "you should pick the right tool for the job".
>
> The purpose of this discussion is to understand/explore whether we really
> need another stack, or whether we can leverage the existing infrastructure
> and expertise to accomplish the task.
> We currently have some ML jobs, and Spark proved to be the perfect fit for
> us. We know it well enough to be confident we can deliver what is asked;
> it scales, it is resilient, it works.
>
> We are starting to evaluate/introduce some DL models, and being able to
> leverage the existing infra would be a big plus. It is not only having to
> deal with a new set of machines running a different stack (i.e.
> TensorFlow, MXNet, ...); it is everything around it: tuning, managing,
> packaging applications, testing and so on. Are these reasonable concerns?
>
> Best,
>
> On Sun, May 5, 2019 at 8:06 PM Gourav Sengupta <gourav.sengupta@gmail.com>
> wrote:
>
>> If someone is trying to actually use deep learning algorithms, their
>> focus should be on choosing the technology stack which gives them the
>> maximum flexibility to try the nuances of their algorithms.
>>
>> From a personal perspective, I always prefer to use libraries which
>> provide the best flexibility and extensibility in terms of the science/
>> mathematics of the subject. For example, try to open a book on Linear
>> Regression and then see whether all the mathematical formulations are
>> available in the Spark module for regression or not.
>>
>> It is always better to choose a technology that fits into the nuances and
>> perfection of the science, rather than choose a technology and then try to
>> fit the science into it.
>>
>> Regards,
>> Gourav
>>
>> On Sun, May 5, 2019 at 2:23 PM Jason Dai <jason.dai@gmail.com> wrote:
>>
>>> You may find talks from Analytics Zoo users at
>>> https://analytics-zoo.github.io/master/#presentations/; in particular,
>>> some recent user examples of Analytics Zoo:
>>>
>>>    - Mastercard:
>>>    https://software.intel.com/en-us/articles/deep-learning-with-analytic-zoo-optimizes-mastercard-recommender-ai-service
>>>
>>>    - Azure:
>>>    https://software.intel.com/en-us/articles/use-analytics-zoo-to-inject-ai-into-customer-service-platforms-on-microsoft-azure-part-1
>>>    - CERN:
>>>    https://db-blog.web.cern.ch/blog/luca-canali/machine-learning-pipelines-high-energy-physics-using-apache-spark-bigdl
>>>    - Midea/KUKA:
>>>    https://software.intel.com/en-us/articles/industrial-inspection-platform-in-midea-and-kuka-using-distributed-tensorflow-on-analytics
>>>    - Talroo:
>>>    https://software.intel.com/en-us/articles/talroo-uses-analytics-zoo-and-aws-to-leverage-deep-learning-for-job-recommendation
>>>
>>> Thanks,
>>> -Jason
>>>
>>> On Sun, May 5, 2019 at 6:29 AM Riccardo Ferrari <ferrarir@gmail.com>
>>> wrote:
>>>
>>>> Thank you for your answers!
>>>>
>>>> While it is clear each DL framework can solve distributed model
>>>> training on its own (some better than others), I still see a lot of
>>>> value in having Spark on the ETL/pre-processing side, thus the origin
>>>> of my question.
>>>> I am trying to avoid managing multiple stacks/workflows and hoping to
>>>> unify my system. Projects like TensorFlowOnSpark or Analytics Zoo (to
>>>> name a couple) feel like they can help; still, I really appreciate
>>>> your comments and anyone who could add some value to this discussion.
>>>> Does anyone have experience with them?
>>>>
>>>> Thanks
>>>>
>>>> On Sat, May 4, 2019 at 8:01 PM Pat Ferrel <pat@occamsmachete.com>
>>>> wrote:
>>>>
>>>>> @Riccardo
>>>>>
>>>>> Spark does not do the DL learning part of the pipeline (AFAIK), so it
>>>>> is limited to data ingestion and transforms (ETL). It is therefore
>>>>> optional, and other ETL options might be better for you.
>>>>>
>>>>> Most of the technologies @Gourav mentions have their own scaling based
>>>>> on their own compute engines specialized for their DL implementations,
>>>>> so be aware that Spark scaling has nothing to do with scaling most of
>>>>> the DL engines; they have their own solutions.
>>>>>
>>>>> From: Gourav Sengupta <gourav.sengupta@gmail.com>
>>>>> <gourav.sengupta@gmail.com>
>>>>> Reply: Gourav Sengupta <gourav.sengupta@gmail.com>
>>>>> <gourav.sengupta@gmail.com>
>>>>> Date: May 4, 2019 at 10:24:29 AM
>>>>> To: Riccardo Ferrari <ferrarir@gmail.com> <ferrarir@gmail.com>
>>>>> Cc: User <user@spark.apache.org> <user@spark.apache.org>
>>>>> Subject:  Re: Deep Learning with Spark, what is your experience?
>>>>>
>>>>> Try using MXNet and Horovod directly (I think that MXNet is worth a
>>>>> try as well):
>>>>> 1.
>>>>> https://medium.com/apache-mxnet/distributed-training-using-apache-mxnet-with-horovod-44f98bf0e7b7
>>>>> 2.
>>>>> https://docs.nvidia.com/deeplearning/dgx/mxnet-release-notes/rel_19-01.html
>>>>> 3. https://aws.amazon.com/mxnet/
>>>>> 4.
>>>>> https://aws.amazon.com/blogs/machine-learning/aws-deep-learning-amis-now-include-horovod-for-faster-multi-gpu-tensorflow-training-on-amazon-ec2-p3-instances/
>>>>>
>>>>>
>>>>> Of course, TensorFlow is backed by Google's advertising team as well:
>>>>> https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-training-with-tensorflow/
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> On Sat, May 4, 2019 at 10:59 AM Riccardo Ferrari <ferrarir@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi list,
>>>>>>
>>>>>> I am trying to understand if it makes sense to leverage Spark as an
>>>>>> enabling platform for Deep Learning.
>>>>>>
>>>>>> My open questions to you are:
>>>>>>
>>>>>>    - Do you use Apache Spark in your DL pipelines?
>>>>>>    - How do you use Spark for DL? Is it just a stand-alone stage in
>>>>>>    the workflow (i.e. a data preparation script) or is it more
>>>>>>    integrated?
>>>>>>
>>>>>> I see a major advantage in leveraging Spark as a unified entry point:
>>>>>> for example, you can easily abstract data sources and leverage
>>>>>> existing team skills for data pre-processing and training. On the
>>>>>> flip side you may hit some limitations, including supported versions
>>>>>> and so on. What is your experience?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>
