spark-user mailing list archives

From Vikas Garg <sperry...@gmail.com>
Subject Re: Learning Spark
Date Fri, 05 Jul 2019 09:39:34 GMT
Is there any disadvantage to using Python? I have gone through multiple
articles which say that Python has advantages over Scala.

Scala is super fast in comparison but Python has more pre-built libraries
and options for analytics.

Should I still go with Scala?

On Fri, 5 Jul 2019 at 13:07, Kurt Fehlhauer <kfehlhau@gmail.com> wrote:

> Since you are a data engineer I would start by learning Scala. The parts
> of Scala you would need to learn are pretty basic. Start with the examples
> on the Spark website, which gives examples in multiple languages. Think of
> Scala as a typed version of Python. You will find that the error messages
> tend to be much more meaningful in Scala because that is the native
> language of Spark. If you don’t want to install the JVM and Scala, I
> highly recommend Databricks community edition as a place to start.
>
> On Thu, Jul 4, 2019 at 11:22 PM Vikas Garg <sperry.it@gmail.com> wrote:
>
>> I am currently working as a data engineer, using Power BI and SSIS (an
>> ETL tool). For learning purposes, I have set up PySpark and am also able
>> to run queries through Spark on a multi-node cluster DB (I am using
>> Vertica DB and will later move to HDFS or SQL Server).
>>
>> I also have good knowledge of Python.
>>
>> On Fri, 5 Jul 2019 at 10:32, Kurt Fehlhauer <kfehlhau@gmail.com> wrote:
>>
>>> Are you a data scientist or data engineer?
>>>
>>>
>>> On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg <sperry.it@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am a new Spark learner. Can someone guide me on a strategy for
>>>> gaining expertise in PySpark?
>>>>
>>>> Thanks!!!
>>>>
>>>
