spark-issues mailing list archives

From "Saisai Shao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-1363) Add streaming support for Spark SQL module
Date Mon, 31 Mar 2014 11:09:18 GMT
Saisai Shao created SPARK-1363:
----------------------------------

             Summary: Add streaming support for Spark SQL module
                 Key: SPARK-1363
                 URL: https://issues.apache.org/jira/browse/SPARK-1363
             Project: Spark
          Issue Type: New Feature
          Components: SQL
            Reporter: Saisai Shao


Projects such as Pig on Storm and SQL on Storm (Squall, SQLstream) can already query
streaming data, but Spark Streaming has no equivalent. Streaming-capable SQL would be a
good feature to add to Spark SQL.

From a semantic perspective, DStream is very similar to RDD: both have join, filter, groupBy
operators and so on. DStream is also backed by RDD, so the existing Spark plan can be
transplanted and reused.

Catalyst also has a clear division between its steps, so we can fully reuse its parsing and
logical plan analysis steps and change only the physical plan.

So here we propose to add streaming support in Catalyst.
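
To make the idea concrete, here is a minimal toy sketch in plain Python. It is not Catalyst
or Spark code: the names Scan, Filter, Project, execute_batch and execute_streaming are
hypothetical and exist only for illustration. It assumes a stream can be modelled as a
sequence of micro-batches, so one logical plan can be shared and only the physical
execution differs.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


# Toy logical plan nodes (hypothetical; these are not Catalyst classes).
@dataclass(frozen=True)
class Scan:
    source: str


@dataclass(frozen=True)
class Filter:
    predicate: Callable[[dict], bool]
    child: object


@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]
    child: object


def execute_batch(plan, tables):
    """Physical strategy 1: evaluate the plan over one finite collection (RDD-like)."""
    if isinstance(plan, Scan):
        return list(tables[plan.source])
    if isinstance(plan, Filter):
        return [row for row in execute_batch(plan.child, tables) if plan.predicate(row)]
    if isinstance(plan, Project):
        return [{c: row[c] for c in plan.columns}
                for row in execute_batch(plan.child, tables)]
    raise TypeError(f"unsupported plan node: {plan!r}")


def source_of(plan):
    """Walk down a single-source plan to find the name of its Scan."""
    while not isinstance(plan, Scan):
        plan = plan.child
    return plan.source


def execute_streaming(plan, streams):
    """Physical strategy 2: run the same logical plan once per micro-batch (DStream-like)."""
    name = source_of(plan)
    for micro_batch in streams[name]:
        yield execute_batch(plan, {name: micro_batch})


# One logical plan, roughly: SELECT user FROM logs WHERE bytes > 100
plan = Project(("user",), Filter(lambda r: r["bytes"] > 100, Scan("logs")))
rows = [{"user": "a", "bytes": 50},
        {"user": "b", "bytes": 200},
        {"user": "c", "bytes": 300}]

batch_result = execute_batch(plan, {"logs": rows})
stream_result = list(execute_streaming(plan, {"logs": [rows[:2], rows[2:]]}))
```

In this sketch the plan is built once and handed to either executor unchanged. The toy
handles only single-source plans; joins and windowed or stateful operators would need
more than per-batch evaluation.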



--
This message was sent by Atlassian JIRA
(v6.2#6252)
