hive-issues mailing list archives

From "Sahil Takiar (JIRA)" <>
Subject [jira] [Commented] (HIVE-18533) Add option to use InProcessLauncher to submit spark jobs
Date Fri, 27 Apr 2018 16:01:00 GMT


Sahil Takiar commented on HIVE-18533:

[~lirui] could you take a look? Below is a brief description of the change. RB:

* Added the ability to launch jobs via Spark's {{InProcessLauncher}} rather than invoking
* Users can pick which launcher they want to use; by default the {{spark-submit}} launcher
is used
* Renamed {{SparkClientImpl}} to {{AbstractSparkClient}}; it contains all the common logic
between the two launchers
** {{AbstractSparkClient}} has two subclasses: {{SparkLauncherSparkClient}} which uses the
{{InProcessLauncher}} and {{SparkSubmitSparkClient}} which uses {{spark-submit}}
** The changes to {{SparkClientImpl}} are mostly just refactoring; I did my best to ensure
there are no logic changes; the code is now mostly split between {{AbstractSparkClient}} and
*** The biggest change in logic is that now {{SparkSubmitSparkClient#startDriver}} returns
a {{Future}} object instead of a {{Thread}} object
** {{AbstractSparkClient}} has a number of {{abstract}} methods that decide how certain configuration
options need to be set - e.g. how to add jars, specify the keytab / principal, etc.
** Its main method is {{launchDriver}}, which specifies how to actually launch the Spark
app; it returns a {{Future}} object which is used to monitor the state of the Spark app
* {{SparkLauncherSparkClient}} is essentially a wrapper around {{InProcessLauncher}} and it
contains a custom {{Future}} implementation that monitors the underlying Spark app using the
APIs exposed by the {{InProcessLauncher}}
* Added unit tests and a q-test
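
A rough sketch of the class layout described above, for illustration only. It is not the code in the patch: the class names ({{AbstractClientSketch}}, {{SubmitClientSketch}}, {{InProcessClientSketch}}), the {{onState}} callback and the state strings are hypothetical, {{CompletableFuture}} stands in for the custom {{Future}} implementation, and nothing here actually calls Spark.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

// Hypothetical stand-in for the shared base class; holds the common logic.
abstract class AbstractClientSketch {
    // Hooks: each launcher decides how a given option is expressed.
    protected abstract void addJars(String jars);
    protected abstract void addKeytabAndPrincipal(String keytab, String principal);

    // Starts the Spark app; the returned Future is used to monitor its state.
    protected abstract Future<Void> launchDriver();

    // Common flow shared by both launchers.
    final Future<Void> startDriver(String jars, String keytab, String principal) {
        addJars(jars);
        if (keytab != null && principal != null) {
            addKeytabAndPrincipal(keytab, principal);
        }
        return launchDriver();
    }
}

// spark-submit flavour: options become command-line arguments.
class SubmitClientSketch extends AbstractClientSketch {
    final List<String> argv = new ArrayList<>(Collections.singletonList("spark-submit"));

    @Override protected void addJars(String jars) {
        argv.add("--jars");
        argv.add(jars);
    }

    @Override protected void addKeytabAndPrincipal(String keytab, String principal) {
        argv.add("--keytab");
        argv.add(keytab);
        argv.add("--principal");
        argv.add(principal);
    }

    @Override protected Future<Void> launchDriver() {
        // A real client would start the child process here and track it;
        // this sketch only returns an already-completed Future.
        return CompletableFuture.completedFuture(null);
    }
}

// In-process flavour: options become configuration entries.
class InProcessClientSketch extends AbstractClientSketch {
    final Map<String, String> conf = new LinkedHashMap<>();
    private final CompletableFuture<Void> done = new CompletableFuture<>();

    @Override protected void addJars(String jars) {
        conf.put("spark.jars", jars);
    }

    @Override protected void addKeytabAndPrincipal(String keytab, String principal) {
        conf.put("spark.yarn.keytab", keytab);
        conf.put("spark.yarn.principal", principal);
    }

    @Override protected Future<Void> launchDriver() {
        // A real client would launch the app in-process and register a listener.
        return done;
    }

    // Hypothetical callback invoked when the app reports a new state.
    void onState(String state) {
        if ("FINISHED".equals(state)) {
            done.complete(null);
        } else if ("FAILED".equals(state) || "KILLED".equals(state)) {
            done.completeExceptionally(new IllegalStateException("Spark app " + state));
        }
    }
}
```

The point of the split is that callers only see {{startDriver}} and the returned {{Future}}; which launcher is behind it only changes how options are passed and how completion is detected.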

> Add option to use InProcessLauncher to submit spark jobs
> --------------------------------------------------------
>                 Key: HIVE-18533
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>         Attachments: HIVE-18533.1.patch, HIVE-18533.2.patch, HIVE-18533.3.patch, HIVE-18533.4.patch,
HIVE-18533.5.patch, HIVE-18533.6.patch, HIVE-18533.7.patch, HIVE-18533.8.patch
> See discussion in HIVE-16484 for details.
> I think this will help with reducing the amount of time it takes to open a HoS session
+ debuggability (no need to launch a separate process to run a Spark app).

This message was sent by Atlassian JIRA
