spark-user mailing list archives

From Joaquin Alzola <Joaquin.Alz...@lebara.com>
Subject JAr files into python3
Date Sun, 03 Jul 2016 20:01:27 GMT
Hi List,


I have the following script which will be used in Spark.



#!/usr/bin/env python3
from pyspark_cassandra import CassandraSparkContext
from pyspark import SparkConf
from pyspark.sql import SQLContext
import os

os.environ['CLASSPATH'] = "/mnt/spark/lib"

conf = (SparkConf()
        .setAppName("test")
        .setMaster("spark://192.168.23.31:7077")
        .set("spark.cassandra.connection.host", "192.168.23.31"))

sc = CassandraSparkContext(conf=conf)
sqlContext = SQLContext(sc)

df = (sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="lebara_diameter_codes",
               table="nl_lebara_diameter_codes")
      .load())

# Renamed from "list"/"list2" to avoid shadowing the built-in list()
error_codes = df.select("errorcode2001").where("errorcode2001 > 1200").collect()
dates = df.select("date").collect()

print([i for i in error_codes[0]])
print(type(error_codes[0]))



The error it throws is the following (which is logical, because I do not load the
jar files):

py4j.protocol.Py4JJavaError: An error occurred while calling o29.load.

: java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra.
Please find packages at http://spark-packages.org



Is there a way to load those jar files into Python, or onto the classpath, when calling sqlContext.read.format("org.apache.spark.sql.cassandra")?
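[A sketch of the usual approach, not from the original message: setting os.environ['CLASSPATH'] inside the script has no effect on the JVM that Py4J has already started, so the jars are normally passed at submit time with --packages or --jars. The package coordinates and jar filename below are placeholders; the actual release must be looked up on spark-packages.org for the Spark version in use.]

spark-submit \
  --master spark://192.168.23.31:7077 \
  --packages TargetHolding:pyspark-cassandra:0.3.5 \
  your_script.py

# Or, if the jars are already on disk:
spark-submit \
  --master spark://192.168.23.31:7077 \
  --jars /mnt/spark/lib/pyspark-cassandra-assembly.jar \
  your_script.py

[--packages resolves the artifact and its dependencies and ships them to the executors, so no CLASSPATH manipulation is needed in the Python script itself.]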



Or, on the other hand, do I have to create Python scripts with #!/usr/bin/env pyspark?



BR



Joaquin



This email is confidential and may be subject to privilege. If you are not the intended recipient,
please do not copy or disclose its content but contact the sender immediately upon receipt.
