spark-user mailing list archives

From Sunita Arvind <sunitarv...@gmail.com>
Subject Seeking advice on realtime querying over JDBC
Date Thu, 02 Jun 2016 17:47:26 GMT
Hi Experts,

We are trying to ingest a Kafka stream into Spark and expose the
registered table over JDBC for querying. Here are some questions:
1. Spark Streaming supports only a single context per application, right?
If I have multiple customers and would like to create a Kafka topic for
each of them, with one streaming context per topic, is this doable? As per
the current Spark documentation,
http://spark.apache.org/docs/latest/streaming-programming-guide.html#initializing-streamingcontext
I can have only one active streaming context at a time. Is there no way
around that? The use case here is that, if I am looking at a 5-minute
window, the window should contain records for that customer only, which,
as far as I can tell, is possible only with a customer-specific streaming
context.

2. If I am able to create multiple contexts in this fashion, can I register
the streams as temp tables in my application and expose them over JDBC?
Going by
https://forums.databricks.com/questions/1464/how-to-configure-thrift-server-to-use-a-custom-spa.html,
it looks like I can connect the Thrift server to only a single Spark
SQLContext. Does having multiple streaming contexts mean I automatically
have multiple SQL contexts?

3. Can I use SQLContext, or do I need a HiveContext in order to see the
tables registered by the Spark application through JDBC?

Regards,
Sunita
