spark-user mailing list archives

From Todd Nist <tsind...@gmail.com>
Subject Re: Starting Spark SQL thrift server from within a streaming app
Date Thu, 06 Aug 2015 12:57:41 GMT
Well, the point of creating a thrift server would be to allow external access
to the data over JDBC / ODBC connections.  The spark-streamingsql project
leverages a standard Spark SQL context and then provides a means of
converting each incoming DStream message into a row; look at the
MessageToRow trait in the KafkaSource class.
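
To give a rough feel for the row-conversion idea, here is a minimal sketch in
plain Scala.  It is not the project's actual MessageToRow trait; the names
MessageToRowSketch and CsvMessageToRow, the "<id>,<text>" message format, and
the use of Seq[Any] as a stand-in for a Spark SQL Row are all assumptions made
for illustration.

```scala
// Hypothetical stand-in for a message-to-row converter; NOT the project's real trait.
trait MessageToRowSketch extends Serializable {
  // Turns one raw message into an ordered sequence of column values.
  def toRow(message: String): Seq[Any]
}

// Assumed message format: "<id>,<text>", e.g. "1,hello".
object CsvMessageToRow extends MessageToRowSketch {
  override def toRow(message: String): Seq[Any] = {
    // Split on the first comma only, so the text column may itself contain commas.
    val parts = message.split(",", 2)
    require(parts.length == 2, s"expected '<id>,<text>' but got: $message")
    Seq(parts(0).trim.toInt, parts(1))
  }
}
```

In a real job the Seq[Any] would presumably become a Row matching the schema
declared for the stream.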

The example org.apache.spark.sql.streaming.examples.KafkaDDL should make
it clear, I think.
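
For comparison, going back to the original question, a thrift server started
from inside a plain streaming app might look roughly like the sketch below.
It assumes Spark 1.4-era APIs (HiveContext, registerTempTable,
HiveThriftServer2.startWithContext); the socket source on localhost:9999, the
10 second batch interval, and the table name "latest_batch" are placeholders
I made up, and I have not run this exact code.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingThriftSketch {
  def main(args: Array[String]): Unit = {
    // master is expected to be passed via spark-submit
    val conf = new SparkConf().setAppName("Streaming ThriftServer sketch")
    val ssc = new StreamingContext(conf, Seconds(10)) // assumed batch interval
    val sqlContext = new HiveContext(ssc.sparkContext)
    import sqlContext.implicits._

    // Start the thrift server on the same context the streaming job uses,
    // so JDBC / ODBC clients can see the temp tables registered below.
    HiveThriftServer2.startWithContext(sqlContext)

    // Assumed source: plain text lines from a local socket.
    val lines = ssc.socketTextStream("localhost", 9999)

    // Each batch replaces the temp table, so clients only see the latest batch.
    lines.foreachRDD { rdd =>
      rdd.map(line => (line.length, line))
        .toDF("len", "line")
        .registerTempTable("latest_batch")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

A JDBC client such as beeline could then query latest_batch while the stream
is running.  Keeping more than the latest batch would need extra work, for
example writing each batch out to a persistent table.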

-Todd

On Thu, Aug 6, 2015 at 7:58 AM, Daniel Haviv <
daniel.haviv@veracity-group.com> wrote:

> Thank you Todd,
> How is the sparkstreaming-sql project different from starting a thrift
> server in a streaming app?
>
> Thanks again.
> Daniel
>
>
> On Thu, Aug 6, 2015 at 1:53 AM, Todd Nist <tsindotg@gmail.com> wrote:
>
>> Hi Daniel,
>>
>> It is possible to create an instance of the Spark SQL Thrift server;
>> however, it seems like this project may be what you are looking for:
>>
>> https://github.com/Intel-bigdata/spark-streamingsql
>>
>> Not 100% sure what your use case is, but you can always convert the data
>> into a DataFrame and then issue a query against it.  If you want other
>> systems to be able to query it, there are numerous connectors to store
>> data into Hive, Cassandra, HBase, Elasticsearch, ....
>>
>> To create an instance of a thrift server with its own SQL context, you
>> would do something like the following:
>>
>> import org.apache.spark.{SparkConf, SparkContext}
>> import org.apache.spark.sql.hive.HiveContext
>> import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
>>
>> object MyThriftServer {
>>
>>   def main(args: Array[String]): Unit = {
>>     val sparkConf = new SparkConf()
>>       // master is passed to spark-submit, but could also be specified explicitly
>>       // .setMaster(sparkMaster)
>>       .setAppName("My ThriftServer")
>>       .set("spark.cores.max", "2")
>>     val sc = new SparkContext(sparkConf)
>>     val sqlContext = new HiveContext(sc)
>>     import sqlContext.implicits._
>>
>>     // Register a small cached DataFrame as temp table "t" so clients can query it
>>     sc.makeRDD((1, "hello") :: (2, "world") :: Nil).toDF().cache().registerTempTable("t")
>>
>>     // Start the thrift server on this context so it sees the temp table
>>     HiveThriftServer2.startWithContext(sqlContext)
>>   }
>> }
>>
>> Again, I'm not really clear what your use case is, but it does sound like
>> the project linked above is what you may want.
>>
>> -Todd
>>
>> On Wed, Aug 5, 2015 at 1:57 PM, Daniel Haviv <
>> daniel.haviv@veracity-group.com> wrote:
>>
>>> Hi,
>>> Is it possible to start the Spark SQL thrift server from within a
>>> streaming app so the streamed data can be queried as it comes in?
>>>
>>> Thank you.
>>> Daniel
>>>
>>
>>
>
