Okay, I would disagree with all of this.

Dr. Matei Zaharia created Spark.
Then he and Bill Chambers wrote a book on Spark recently.
He is still the main thinking power behind Spark (look at his research at Stanford).
The name of the book is "Spark: The Definitive Guide". It's the best book and introduction on Spark ever.

I have been through a lot of documentation and at least 40 books on Spark, and nothing even comes close to this book. It also puts to rest many of the arguments around which language to choose.

Scala is better suited to data engineering work. It also has better integration with other components like HBase, Kafka, etc.

Python is great for data scientists as there are more data science libraries available in Python.

Is there any disadvantage to using Python? I have gone through multiple articles which say that Python has advantages over Scala.

Scala is super fast in comparison, but Python has more pre-built libraries and options for analytics.

Should I still go with Scala?

Since you are a data engineer, I would start by learning Scala. The parts of Scala you would need to learn are pretty basic. Start with the examples on the Spark website, which gives examples in multiple languages. Think of Scala as a typed version of Python. You will find that the error messages tend to be much more meaningful in Scala because that is the native language of Spark. If you don’t want to install the JVM and Scala, I highly recommend Databricks Community Edition as a place to start.

I am currently working as a data engineer, and I work with Power BI and SSIS (an ETL tool). For learning purposes, I have set up PySpark and am also able to run queries through Spark on a multi-node cluster DB (I am using Vertica DB and will later move on to HDFS or SQL Server).

I also have good knowledge of Python.

Are you a data scientist or data engineer?

I am a new Spark learner. Can someone guide me on a strategy for gaining expertise in PySpark?