spark-user mailing list archives

From Simone Franzini <>
Subject Spark on DSE Cassandra with multiple data centers
Date Wed, 11 May 2016 16:15:41 GMT
I am running Spark on DSE Cassandra with multiple analytics data centers.
It is my understanding that with this setup you should have a CFS file
system for each data center. I was able to create an additional CFS file
system as described here:
I verified that the additional CFS file system was created properly.
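
Just to illustrate the intended layout (the keyspace name, data center names and replication factors below are placeholders of mine, not the exact statements from that doc): the default cfs keyspace is replicated only in the first analytics data center, while the keyspace backing the additional CFS is replicated only in the second one. Roughly, in cqlsh:

-- Placeholder names/RFs, only to show the intended replication layout.
ALTER KEYSPACE cfs
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      '<first_analytics_datacenter>': 3};

CREATE KEYSPACE <additional_cfs_keyspace>
  WITH replication = {'class': 'NetworkTopologyStrategy',
                      '<second_analytics_datacenter>': 3};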

I am now following the instructions here to configure Spark on the second
data center to use its own CFS:
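
Concretely, the change amounts to pointing the Spark event log location on the second data center's nodes at the new CFS instead of the default cfs: (assuming the relevant property is spark.eventLog.dir in spark-defaults.conf; the CFS name is a placeholder):

# spark-defaults.conf on the second analytics data center (placeholder name)
spark.eventLog.enabled  true
spark.eventLog.dir      <additional_cfs_name>:/spark/events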
However, running:
dse hadoop fs -mkdir <additional_cfs_name>:/spark/events
fails with:
WARN You are going to access CFS keyspace: cfs in data center:
<second_analytics_datacenter>. It will not work because the replication
factor for this keyspace in this data center is 0.
Bad connection to FS. command aborted. exception: UnavailableException()

That is, it appears that the <additional_cfs_name>: prefix in the hadoop command is
being ignored and that it is trying to connect to cfs: rather than
<additional_cfs_name>:.
Has anybody else run into this?

Simone Franzini, PhD
