spark-user mailing list archives

From Mohammed Guller
Subject RE: Creating a front-end for output from Spark/PySpark
Date Wed, 26 Nov 2014 01:05:20 GMT
Two options that I can think of:

1) Use the Spark SQL Thrift/JDBC server.

2) Develop a web app using some framework such as Play and expose a set of REST APIs
for sending queries. Inside the web app's backend, initialize the Spark SQL context only
once, when the app starts, and then reuse that context to execute every query that arrives
through the REST API (see the sketch after this list).
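
For option 1, you start the server that ships with Spark (sbin/start-thriftserver.sh) and
point any JDBC client at it, e.g. beeline -u jdbc:hive2://localhost:10000. For option 2,
below is a minimal sketch of the idea in Python, using Flask in place of Play purely to
keep it short; the /query route, the app name, and the table-registration step are
illustrative assumptions, not anything Spark prescribes.

# Minimal sketch of option 2 (illustrative only): a web backend that
# creates the Spark SQL context once at startup and reuses it for every
# query posted over REST. Flask stands in for Play here; the /query
# route, app name, and table name are assumptions, not a Spark API.
from flask import Flask, request, jsonify
from pyspark import SparkContext
from pyspark.sql import SQLContext

app = Flask(__name__)

# Created once when the backend starts; shared by all requests, so there
# is no per-request context/connection startup cost.
sc = SparkContext(appName="QueryBackend")
sqlContext = SQLContext(sc)

# Hypothetical startup step: pull your Phoenix data (e.g., over JDBC)
# and register it as a temp table so the queries have something to hit:
#   rdd = ...  # load from Phoenix, convert to a SchemaRDD/DataFrame
#   rdd.registerTempTable("analysis_data")

@app.route("/query", methods=["POST"])
def run_query():
    # The SQL string comes from the GUI front-end.
    query = request.json["sql"]
    # Reuses the long-lived context instead of rebuilding it per query.
    rows = sqlContext.sql(query).collect()
    return jsonify(rows=[list(r) for r in rows])

if __name__ == "__main__":
    app.run(port=8080)

A GUI front-end would then POST a body like {"sql": "SELECT ..."} to /query and render
the JSON rows it gets back; the expensive context setup happens once, not per query.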


From: Alaa Ali
Sent: Sunday, November 23, 2014 12:37 PM
Subject: Creating a front-end for output from Spark/PySpark

Hello. Okay, so I'm working on a project to run analytic processing using Spark or PySpark.
Right now, I connect to the shell and execute my commands. The first part of my commands
creates a JDBC connection and cursor to pull data from Apache Phoenix; I then do some
processing on the returned data and spit out some output. I want to create a web "GUI" tool
of sorts where I can play around with which SQL query is executed for my analysis.

I know that I can write my whole Spark program, use spark-submit, and have it accept an
argument for the SQL query I want to execute, but that means every time I submit, a SQL
connection is created, the query is run, the processing is done, the output is printed, and
then the program and the SQL connection close, with the whole thing repeating if I want to
run another query right away. That will probably be very slow. Is there a way to keep the
SQL connection "working" in the backend, for example, so that all I have to do is supply a
query from my GUI tool, which then takes it, runs it, and displays the output? I just want
the big picture: a broad overview of how I would go about doing this and what additional
technologies to use, and I'll dig up the rest.

Alaa Ali