spark-user mailing list archives

From Gourav Sengupta <gourav.sengupta.develo...@gmail.com>
Subject Re: Petastorm vs horovod vs tensorflowonspark vs spark_tensorflow_distributor
Date Mon, 07 Jun 2021 20:02:00 GMT
Hi Sean,

Thank you so much for your kind response :)

Regards,
Gourav Sengupta

On Sat, Jun 5, 2021 at 8:00 PM Sean Owen <srowen@gmail.com> wrote:

> All of these tools are reasonable choices. I don't think the Spark project
> itself has a view on what works best. These tools do different things. For
> example petastorm is not a training framework, but a way to feed data to a
> distributed DL training process on Spark. For what it's worth, Databricks
> ships Horovod and Petastorm, but that doesn't mean the other projects are
> second-class.
>
> On Tue, Jun 1, 2021 at 4:59 PM Gourav Sengupta <
> gourav.sengupta.developer@gmail.com> wrote:
>
>> Dear TD, Matei, Michael, Reynold,
>>
>> I hope all of you and your loved ones are staying safe and doing well.
>>
>> As a member of the community, I am finding the direction from the SPARK
>> mentors a bit confusing, and I was wondering if I could seek your help.
>>
>> We have to make long-term decisions that are aligned with open source
>> SPARK compatibility and direction, and it would be wonderful to know which
>> is the most dependable route to get data from SPARK to tensorflow. Is it:
>> 1. petastorm
>> 2. horovod
>> 3. tensorflowonspark
>> 4. spark_tensorflow_distributor
>> or something else.
>>
>>
>> Any comments from you will be super useful.
>>
>> If I am not wrong, seamless integration between SPARK and tensorflow/
>> pytorch was one of the most exciting visions of SPARK 3.x.
>>
>> While SPARK ML has its own niche, I think that tensorflow and pytorch
>> will see a lot of focused development as well.
>>
>>
>> Regards,
>> Gourav Sengupta
>>
>
