spark-user mailing list archives

From dwgw <dwijadas...@gmail.com>
Subject Unable to create direct stream with SSL enabled Kafka cluster
Date Sat, 15 May 2021 14:49:12 GMT
Hi,
I am trying to stream a Kafka topic using createDirectStream(). The Kafka
cluster is SSL enabled. The code is:

***************************

import findspark

findspark.init('/u01/idp/spark')

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils


kafkaParams = {
    "metadata.broker.list": "kfk-bro2.mydomain.com:9093",
    "security.protocol": "ssl",
    "ssl.key.password": "Password123",
    "ssl.keystore.location": "/tmp/keystore.jks",
    "ssl.keystore.password": "Password123",
    "ssl.truststore.location": "/tmp/truststore.jks",
    "ssl.truststore.password": "Password123",
    "ssl.endpoint.identification.algorithm": "",
}



if __name__ == "__main__":
    sc = SparkContext(appName="PythonStreamingReciever")
    ssc = StreamingContext(sc, 30)  # 30-second batch interval

    message = KafkaUtils.createDirectStream(ssc, ["test1_topic"], kafkaParams)

    # Each record is a (key, value) tuple; keep only the value.
    lines = message.map(lambda x: x[1])
    lines.pprint()

    ssc.start()
    ssc.awaitTermination()

***************************
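Because the keystore and truststore entries in kafkaParams are local file paths, one quick sanity check is to confirm that those files exist on the machine running the script. The sketch below is only an illustration: the helper name missing_ssl_files is hypothetical, and it checks only the host it runs on (the driver, in client mode). It does not verify that YARN executors can see the same paths.

```python
import os


def missing_ssl_files(params):
    """Return the ssl.*.location entries whose file is not present locally.

    `params` is a dict shaped like kafkaParams above. Only keys that start
    with "ssl." and end with ".location" are treated as file paths.
    """
    missing = {}
    for key, value in params.items():
        if key.startswith("ssl.") and key.endswith(".location"):
            if not os.path.isfile(value):
                missing[key] = value
    return missing


if __name__ == "__main__":
    # Paths copied from the post; adjust to the real environment.
    example_params = {
        "security.protocol": "ssl",
        "ssl.keystore.location": "/tmp/keystore.jks",
        "ssl.truststore.location": "/tmp/truststore.jks",
    }
    for key, path in missing_ssl_files(example_params).items():
        print("missing on this host: %s -> %s" % (key, path))
```

An empty result only means the files are present on this host; it says nothing about whether the passwords or certificates inside them are valid.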

I submit the Python script to the cluster using spark-submit:

# spark-submit --master yarn --deploy-mode client \
    --files /u01/idp/spark/conf/log4j.properties \
    --conf "spark.executor.extraJavaOptions='-Dlog4j.configuration=log4j.properties'" \
    --driver-java-options "-Dlog4j.configuration=file:/u01/idp/spark/conf/log4j.properties" \
    --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.3,org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.3 \
    streamingKafka.py

But when the script runs, I get the following error:

File "/home/spark/streamingKafka.py", line 23, in <module>
   message = KafkaUtils.createDirectStream(ssc, ["test1_topic"], kafkaParams)
.........
.........
File "/u01/idp/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/u01/idp/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o43.createDirectStreamWithoutMessageHandler.
........
........
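The elided part of the trace is where the underlying Java exception would normally appear, and that is usually the most informative piece. A minimal sketch for surfacing it follows. It relies on the java_exception attribute that py4j's Py4JJavaError carries; the helper name describe_error is hypothetical, and the usage lines are left as comments because they need a live Spark session.

```python
import traceback


def describe_error(exc):
    """Return the most detailed one-line description available for `exc`.

    py4j's Py4JJavaError keeps the underlying Java Throwable in its
    `java_exception` attribute; ordinary Python exceptions do not have it.
    """
    java_exc = getattr(exc, "java_exception", None)
    if java_exc is not None:
        # toString() is the Java Throwable method, invoked through py4j.
        return "Java exception: " + str(java_exc.toString())
    text = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    return "Python exception: " + text


# Intended use around the failing call (requires a running Spark job):
#
#     try:
#         message = KafkaUtils.createDirectStream(ssc, ["test1_topic"], kafkaParams)
#     except Exception as exc:
#         print(describe_error(exc))
#         raise
```

Printing the exception itself should also include the Java stack trace; the helper just isolates the Java-side message so it is easier to quote.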

What could be the possible causes of this error?

I can consume the Kafka topic using the console consumer and can reach any of
the brokers.

Kafka version: 2.12
Spark version: 2.4.6
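
One possibility worth checking is whether the spark-streaming-kafka-0-8 connector honours the ssl.* settings at all. The Spark 2.4 Kafka integration guide lists SSL/TLS support only for the 0-10 integration. Since spark-sql-kafka-0-10 is already on the --packages list, a hedged alternative is the Structured Streaming Kafka source, where consumer settings are passed with a "kafka." prefix. The sketch below is an assumption about how that could look, not a confirmed fix. The helper names kafka_ssl_options and start_console_stream are hypothetical, and the example values are taken from the post.

```python
def kafka_ssl_options(bootstrap_servers, topic, keystore, truststore, password):
    """Build options for the Structured Streaming Kafka source.

    Kafka consumer settings carry a "kafka." prefix; "subscribe" names the
    topic. A single password is reused here only because the post does so.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "kafka.security.protocol": "SSL",
        "kafka.ssl.keystore.location": keystore,
        "kafka.ssl.keystore.password": password,
        "kafka.ssl.key.password": password,
        "kafka.ssl.truststore.location": truststore,
        "kafka.ssl.truststore.password": password,
        "kafka.ssl.endpoint.identification.algorithm": "",
    }


def start_console_stream(spark, options):
    """Read the topic and print message values to the console sink."""
    reader = spark.readStream.format("kafka")
    for key, value in options.items():
        reader = reader.option(key, value)
    values = reader.load().selectExpr("CAST(value AS STRING) AS value")
    return values.writeStream.format("console").start()


# Intended use (requires pyspark and the spark-sql-kafka-0-10 package):
#
#     from pyspark.sql import SparkSession
#     spark = SparkSession.builder.appName("KafkaSslSketch").getOrCreate()
#     opts = kafka_ssl_options("kfk-bro2.mydomain.com:9093", "test1_topic",
#                              "/tmp/keystore.jks", "/tmp/truststore.jks",
#                              "Password123")
#     start_console_stream(spark, opts).awaitTermination()
```

The keystore and truststore paths would most likely need to exist on the executors as well as the driver, which this sketch does not arrange.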

Thanks



