spark-user mailing list archives

From Conconscious <conconsci...@gmail.com>
Subject Spark querying C* in Scala
Date Mon, 22 Jan 2018 13:43:10 GMT
Hi list,

I have a Cassandra table with two fields: id bigint and kafka text.

My goal is to read only the kafka field (which contains JSON) and infer
its schema.

I have this skeleton code (not working):

sc.stop
import org.apache.spark._
import com.datastax.spark._
import org.apache.spark.sql.functions.get_json_object

import org.apache.spark.sql.functions.to_json
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

val conf = new SparkConf(true)
  .set("spark.cassandra.connection.host", "127.0.0.1")
  .set("spark.cassandra.auth.username", "cassandra")
  .set("spark.cassandra.auth.password", "cassandra")
val sc = new SparkContext(conf)

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val df = sqlContext.sql("SELECT kafka from table1")
df.printSchema()

I think I have at least two problems: the keyspace is missing, so the
table is not recognized, and the schema is certainly not going to be
inferred from the text field.

I have a working solution for JSON files, but I can't "translate" it
to Cassandra:

import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("Spark SQL basic example").getOrCreate()
import spark.implicits._
val redf = spark.read.json("/usr/local/spark/examples/cqlsh_r.json")
redf.printSchema
redf.count
redf.show
redf.createOrReplaceTempView("clicks")
val clicksDF = spark.sql("SELECT * FROM clicks")
clicksDF.show()
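
For reference, here is a rough, untested sketch of how the JSON-file version might translate to Cassandra. It assumes the DataStax spark-cassandra-connector is on the classpath, that the keyspace is called my_keyspace (a placeholder, the real name is not given above), and that the connection settings are picked up from the session configuration (in spark-shell they may need to be passed with --conf instead). The idea is to load the table through the connector's data source, keep only the kafka column as a Dataset[String], and hand that to spark.read.json, which accepts a Dataset[String] since Spark 2.2.0:

```scala
import org.apache.spark.sql.SparkSession

// Connection settings copied from the skeleton above.
val spark = SparkSession.builder()
  .appName("Cassandra JSON schema inference")
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .config("spark.cassandra.auth.username", "cassandra")
  .config("spark.cassandra.auth.password", "cassandra")
  .getOrCreate()
import spark.implicits._

// "my_keyspace" is a placeholder: replace it with the real keyspace name.
val kafkaDS = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_keyspace", "table" -> "table1"))
  .load()
  .select("kafka")
  .where($"kafka".isNotNull) // skip rows without a JSON payload
  .as[String]

// Infer the schema from the JSON strings, as spark.read.json does for files.
val jsonDF = spark.read.json(kafkaDS)
jsonDF.printSchema()

jsonDF.createOrReplaceTempView("clicks")
spark.sql("SELECT * FROM clicks").show()
```

Schema inference scans the JSON strings, so on a large table it may be cheaper to infer from a sample or to pass an explicit schema to from_json.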

My Spark version is 2.2.1 and my Cassandra version is 3.11.1.

Thanks in advance


