From Joaquin Alzola <>
Subject JAr files into python3
Date Sun, 03 Jul 2016 20:01:27 GMT
Hi list,

I have the following script which will be used in Spark.

#!/usr/bin/env python3

from pyspark_cassandra import CassandraSparkContext, Row
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext
import os

conf = SparkConf().setAppName("test").setMaster("spark://").set("",

sc = CassandraSparkContext(conf=conf)
sqlContext = SQLContext(sc)

df = sqlContext.read.format("org.apache.spark.sql.cassandra").options(keyspace="lebara_diameter_codes",

list = df.select("errorcode2001").where("errorcode2001 > 1200").collect()
list2 = df.select("date").collect()

print([i for i in list[0]])


The script throws the following error, which makes sense because I do not load the
jar files:

py4j.protocol.Py4JJavaError: An error occurred while calling o29.load.

: java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra.
Please find packages at

Is there a way to load those jar files into Python, or onto the classpath, when calling sqlContext.read.format("org.apache.spark.sql.cassandra")?

Or do I instead have to create Python scripts with #!/usr/bin/env pyspark?
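
To make the question concrete, here is a minimal sketch of one approach that might work, assuming PySpark reads the PYSPARK_SUBMIT_ARGS environment variable when it launches the JVM. The helper name build_submit_args and the jar path are hypothetical placeholders, not something taken from my setup, and the variable would have to be set before the SparkContext is created.

```python
import os


def build_submit_args(jars=(), packages=()):
    """Build a PYSPARK_SUBMIT_ARGS string (hypothetical helper).

    jars:     local jar paths, passed to spark-submit as --jars
    packages: Maven coordinates, passed to spark-submit as --packages
    """
    parts = []
    if jars:
        parts += ["--jars", ",".join(jars)]
    if packages:
        parts += ["--packages", ",".join(packages)]
    # The argument string has to end with "pyspark-shell".
    parts.append("pyspark-shell")
    return " ".join(parts)


# Placeholder path: replace with the real location of the connector jar.
submit_args = build_submit_args(jars=["/path/to/pyspark-cassandra-assembly.jar"])

# Assumption: this must happen before CassandraSparkContext(conf=conf) is
# created, because the JVM is started with these arguments at that point.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args
```

With something like this at the top of the script, the rest of the code would stay unchanged and the script could still be started with python3 instead of pyspark, if the assumption holds.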



