spark-user mailing list archives

From Patrick McCarthy <>
Subject Building Spark 3.0.0 for Hive 1.2
Date Fri, 10 Jul 2020 13:17:34 GMT
I'm trying to build Spark 3.0.0 for my Yarn cluster, with Hadoop 2.7.3 and
Hive 1.2.1. I downloaded the source and created a runnable dist with

./dev/ --name custom-spark --pip --r --tgz -Psparkr -Phive-1.2 -Phadoop-2.7 -Pyarn

We're running Spark 2.4.0 in production, so I copied the hive-site.xml and spark-defaults.conf from there.

When I try to create a SparkSession in a normal Python REPL, I get the
following uninformative error. How can I debug this? I can run the
spark-shell and get to a Scala prompt with Hive access seemingly without
issue.
Python 3.6.3 (default, Apr 10 2018, 16:07:04) [GCC 4.8.3 20140911 (Red Hat 4.8.3-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> import sys
>>> os.environ['SPARK_HOME'] = '/home/pmccarthy/custom-spark-3'
>>> import pyspark
>>> from pyspark.sql import SparkSession
>>> spark =

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pmccarthy/custom-spark-3/python/pyspark/sql/", line 191, in getOrCreate
    session._jsparkSession.sessionState().conf().setConfString(key, value)
  File "/home/pmccarthy/custom-spark-3/python/lib/", line 1305, in __call__
  File "/home/pmccarthy/custom-spark-3/python/pyspark/sql/", line 137, in deco
  File "<string>", line 3, in raise_from
pyspark.sql.utils.IllegalArgumentException: <exception str() failed>
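
Since the traceback dies inside setConfString while getOrCreate replays the session configs, one way to narrow it down is to apply the spark-defaults.conf entries one at a time until one of them fails. A rough sketch (the helper below is mine, not a Spark API; it assumes the usual whitespace-separated "key value" layout of spark-defaults.conf):

```python
def parse_spark_defaults(path):
    """Return (key, value) pairs from a spark-defaults.conf-style file.

    Blank lines and '#' comments are skipped; each remaining line is
    split into a key and the rest of the line as its value.
    """
    pairs = []
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)  # key, then value (may be absent)
            key = parts[0]
            value = parts[1] if len(parts) > 1 else ""
            pairs.append((key, value))
    return pairs

# With the pairs in hand, the offending key could be isolated like this
# (untested, assumes a working builder otherwise):
#
# from pyspark.sql import SparkSession
# for key, value in parse_spark_defaults("conf/spark-defaults.conf"):
#     try:
#         SparkSession.builder.enableHiveSupport() \
#             .config(key, value).getOrCreate().stop()
#     except Exception as exc:
#         print("offending key:", key, exc)
#         break
```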


Patrick McCarthy

Senior Data Scientist, Machine Learning Engineering


470 Park Ave South, 17th Floor, NYC 10016
