spark-user mailing list archives

From Manoj Samel <manojsamelt...@gmail.com>
Subject Re: [shark-users] SQL on Spark - Shark or SparkSQL
Date Mon, 31 Mar 2014 04:55:35 GMT
Thanks Matei,

Any thoughts on providing a standalone SharkServer equivalent on SparkSQL?

Manoj


On Sun, Mar 30, 2014 at 7:35 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:

> Hi Manoj,
>
> At the current time, for drop-in replacement of Hive, it will be best to
> stick with Shark. Over time, Shark will use the Spark SQL backend, but
> should remain deployable the way it is today (including launching the
> SharkServer, using the Hive CLI, etc). Spark SQL is better for accessing
> Hive data within a Spark program though, where its APIs are richer and
> easier to link to than the SharkContext.sql2rdd we had previously provided
> in Shark.
>
> So in a nutshell, if you have a Shark deployment today, or need the
> HiveServer, then going with Shark will be fine and we will switch out the
> backend in a future release (we'll probably create a preview of this even
> before we're ready to fully switch). If you just want to run SQL queries or
> load SQL data within a Spark program, try out Spark SQL.
>
> Matei
>
> On Mar 30, 2014, at 4:46 PM, Mayur Rustagi <mayur.rustagi@gmail.com>
> wrote:
>
> +1 Have done a few installations of Shark with customers using Hive; they
> love it. Would be good to maintain compatibility with Metastore & QL till
> we have a substantial reason to break off (like BlinkDB).
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Sun, Mar 30, 2014 at 2:46 AM, Nicholas Chammas <
> nicholas.chammas@gmail.com> wrote:
>
>> This is a great question. We are in the same position, having not
>> invested in Hive yet and looking at various options for SQL-on-Hadoop.
>>
>>
>> On Sat, Mar 29, 2014 at 9:48 PM, Manoj Samel <manojsameltech@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> In the context of the recent Spark SQL announcement (
>>> http://databricks.com/blog/2014/03/26/Spark-SQL-manipulating-structured-data-using-Spark.html
>>> ).
>>>
>>> If there is no existing investment in Hive/Shark, would it be worth
>>> starting new SQL work with SparkSQL rather than Shark?
>>>
>>> * It seems the Shark SQL core will use more and more of SparkSQL
>>> * From the blog, it seems Shark has baggage from Hive that is not
>>> needed in this case
>>>
>>> On the other hand, there seem to be two shortcomings of SparkSQL (from
>>> a quick scan of the blog and docs):
>>>
>>> * SparkSQL will have fewer features than Shark/Hive QL, at least for now.
>>> * The standalone SharkServer feature will not be available in SparkSQL.
>>>
>>> Can someone from Databricks shed light on the long-term roadmap? That
>>> would help us avoid investing in an older technology, or in two
>>> technologies, for work that has no Hive requirements.
>>>
>>> Thanks,
>>>
>>> PS: Great work on SparkSQL
>>>
>>>
>>
>
