spark-user mailing list archives

From Tathagata Das <t...@databricks.com>
Subject Re: Query data in Spark RRD
Date Mon, 23 Feb 2015 08:34:04 GMT
You will have to build a split infrastructure: a frontend that takes the
queries from the UI and sends them to the backend, and a backend (running
the Spark Streaming app) that actually runs the queries on the tables created
in the contexts. You will need to implement the RPCs between the frontend
and backend yourself.
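
A minimal sketch of that split, under loud assumptions: it uses plain
Python rather than Spark, an in-memory sqlite3 table stands in for the table
registered in the backend's SQL context, and a thread plus a queue stands in
for whatever RPC layer you pick. The names backend, run_query, and the counts
table with its columns and rows are all hypothetical.

```python
import queue
import sqlite3
import threading


def backend(requests: queue.Queue) -> None:
    """Hypothetical backend loop; in the real setup this is the Spark Streaming app."""
    # Stand-in for the table the streaming job would keep in its SQL context.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE counts (ts INTEGER, country TEXT, device TEXT, cnt INTEGER)"
    )
    # Made-up pre-aggregated counts for two 10-second intervals.
    db.executemany(
        "INSERT INTO counts VALUES (?, ?, ?, ?)",
        [
            (0, "IN", "mobile", 5),
            (0, "IN", "desktop", 3),
            (10, "IN", "mobile", 7),
            (10, "US", "mobile", 2),
        ],
    )
    while True:
        item = requests.get()
        if item is None:  # shutdown signal from the frontend
            break
        sql, reply = item
        try:
            reply.put(("ok", db.execute(sql).fetchall()))
        except sqlite3.Error as exc:
            reply.put(("error", str(exc)))
    db.close()


def run_query(requests: queue.Queue, sql: str) -> list:
    """Hypothetical frontend call: send a query chosen at runtime, wait for rows."""
    reply: queue.Queue = queue.Queue()
    requests.put((sql, reply))
    status, payload = reply.get()
    if status == "error":
        raise RuntimeError(payload)
    return payload


if __name__ == "__main__":
    requests: queue.Queue = queue.Queue()
    worker = threading.Thread(target=backend, args=(requests,), daemon=True)
    worker.start()
    # The dashboard picks the dimensions at runtime; the backend re-aggregates.
    print(run_query(
        requests,
        "SELECT country, SUM(cnt) FROM counts GROUP BY country ORDER BY country",
    ))
    requests.put(None)
    worker.join()
```

The point of the sketch is only the shape: the query text arrives at runtime,
and the long-running process that owns the tables executes it. A real frontend
would also have to validate or restrict the SQL it forwards.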

On Sat, Feb 21, 2015 at 11:57 PM, Nikhil Bafna <nikhil.bafna@flipkart.com>
wrote:

>
> Yes. As I understand it, it would allow me to write SQL queries against a
> Spark context. But each query needs to be specified within a job and deployed.
>
> What I want is to be able to run multiple dynamic queries specified at
> runtime from a dashboard.
>
>
>
> --
> Nikhil Bafna
>
> On Sat, Feb 21, 2015 at 8:37 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> Have you looked at
>> http://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
>> ?
>>
>> Cheers
>>
>> On Sat, Feb 21, 2015 at 4:24 AM, Nikhil Bafna <nikhil.bafna@flipkart.com>
>> wrote:
>>
>>>
>>> Hi.
>>>
>>> My use case is building a realtime monitoring system over
>>> multi-dimensional data.
>>>
>>> The way I'm planning to go about it is to use Spark Streaming to store
>>> aggregated counts over all dimensions at 10-second intervals.
>>>
>>> Then, from a dashboard, I would be able to specify a query over some
>>> dimensions, which will need re-aggregation of the already computed counts.
>>>
>>> My question is: how can I run dynamic queries over data in SchemaRDDs?
>>>
>>> --
>>> Nikhil Bafna
>>>
>>
>>
>
