spark-user mailing list archives

From Vikas Garg <>
Subject Re: Learning Spark
Date Fri, 05 Jul 2019 09:39:34 GMT
Is there any disadvantage to using Python? I have gone through multiple
articles which say that Python has advantages over Scala.

Scala is super fast in comparison, but Python has more pre-built libraries
and options for analytics.

Should I still go with Scala?

On Fri, 5 Jul 2019 at 13:07, Kurt Fehlhauer <> wrote:

> Since you are a data engineer I would start by learning Scala. The parts
> of Scala you would need to learn are pretty basic. Start with the examples
> on the Spark website, which gives examples in multiple languages. Think of
> Scala as a typed version of Python. You will find that the error messages
> tend to be much more meaningful in Scala because that is the native
> language of Spark. If you don’t want to install the JVM and Scala, I
> highly recommend Databricks community edition as a place to start.
> On Thu, Jul 4, 2019 at 11:22 PM Vikas Garg <> wrote:
>> I am currently working as a data engineer, working with Power BI and
>> SSIS (an ETL tool). For learning purposes, I have set up PySpark and
>> am able to run queries through Spark against a multi-node cluster DB (I am
>> using Vertica DB and will later move to HDFS or SQL Server).
>> I also have good knowledge of Python.
>> On Fri, 5 Jul 2019 at 10:32, Kurt Fehlhauer <> wrote:
>>> Are you a data scientist or data engineer?
>>> On Thu, Jul 4, 2019 at 10:34 PM Vikas Garg <> wrote:
>>>> Hi,
>>>> I am a new Spark learner. Can someone guide me on a strategy for
>>>> gaining expertise in PySpark?
>>>> Thanks!!!
