spark-user mailing list archives

From Reynier González Tejeda <reynie...@gmail.com>
Subject Re: Local spark context on an executor
Date Wed, 22 Mar 2017 12:43:06 GMT
Why are you using Spark instead of Sqoop?

2017-03-21 21:29 GMT-03:00 ayan guha <guha.ayan@gmail.com>:

> For JDBC to work, you can start spark-submit with the appropriate JDBC
> driver jars (using --jars); then you will have the driver available on the
> executors.
>
> For acquiring connections, create a singleton connection per executor. I
> think the DataFrame JDBC reader (sqlContext.read.jdbc) already takes care
> of it.
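
A minimal sketch of the "singleton connection per executor" idea, assuming a hypothetical MySQL URL and credentials and assuming the driver jar was shipped with --jars; this only matters when opening raw JDBC connections yourself, since the DataFrame reader manages its own connections:

// Sketch only: the URL and credentials are placeholders; the MySQL driver jar
// is assumed to have been shipped to the executors via spark-submit --jars.
import java.sql.{Connection, DriverManager}

object ExecutorConnection {
  // A lazy val in an object is initialized once per executor JVM, the first
  // time any task on that executor touches it, and is then reused.
  lazy val conn: Connection =
    DriverManager.getConnection("jdbc:mysql://db-host:3306/mydb", "user", "password")
}

// Typical use inside a job, e.g. within foreachPartition:
// rdd.foreachPartition { rows =>
//   val c = ExecutorConnection.conn   // same connection for every partition on this executor
//   rows.foreach { r => /* run statements against c */ }
// }
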
>
> Finally, if you want multiple MySQL tables to be accessed in a single Spark
> job, you can create a list of tables and run a map on that list. Something
> like:
>
> def getTable(tablename: String): DataFrame
> def saveTable(d: DataFrame): Unit
>
> val tables = sc.parallelize(<List of Table>)
> tables.map(getTable).map(saveTable)
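
One way this pattern might be filled in, with the JDBC URL, credentials, table names, and HDFS output path all as placeholders; in this sketch the table list is a plain driver-side collection, since DataFrames can only be created where the SQLContext lives:

// Sketch only: connection details, table names, and paths are placeholders.
import java.util.Properties
import org.apache.spark.sql.DataFrame

val jdbcUrl = "jdbc:mysql://db-host:3306/mydb"
val props = new Properties()
props.setProperty("user", "user")
props.setProperty("password", "password")

def getTable(tablename: String): DataFrame =
  sqlContext.read.jdbc(jdbcUrl, tablename, props)

def saveTable(tablename: String, df: DataFrame): Unit =
  df.write.mode("overwrite").parquet(s"hdfs:///data/mysql_dump/$tablename")

// Each element triggers one distributed JDBC read and one distributed HDFS write.
val tables = Seq("customers", "orders", "line_items")
tables.foreach(t => saveTable(t, getTable(t)))

The loop itself runs on the driver, but every read and write it triggers is still executed by the executors, so the tasks themselves never need their own SparkContext.
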
>
> On Wed, Mar 22, 2017 at 9:41 AM, Shashank Mandil <
> mandil.shashank@gmail.com> wrote:
>
>> I am using Spark to dump data from MySQL into HDFS.
>> The way I am doing this is by creating a Spark DataFrame holding the
>> metadata of the MySQL tables to dump (from multiple MySQL hosts) and then
>> running a map over that DataFrame to dump each table's data into HDFS
>> inside the executor.
>>
>> The reason I want a SparkContext is that I would like to use the Spark
>> JDBC reader to read the MySQL table and then the Spark writer to write to
>> HDFS.
>>
>> Thanks,
>> Shashank
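
A minimal sketch of that read-then-write flow, assuming a hypothetical URL, a numeric column ("id") to partition the read on, and placeholder bounds and paths; with a partition column supplied, the JDBC reader splits the table into ranges that the executors read in parallel:

// Sketch only: every value here is a placeholder; "id" is assumed to be a
// numeric, roughly evenly distributed column such as a primary key.
import java.util.Properties

val props = new Properties()
props.setProperty("user", "user")
props.setProperty("password", "password")

val orders = sqlContext.read.jdbc(
  "jdbc:mysql://db-host:3306/mydb",   // JDBC URL
  "orders",                           // table to dump
  "id",                               // partition column
  0L,                                 // lower bound of id
  10000000L,                          // upper bound of id
  16,                                 // number of partitions / parallel reads
  props)

orders.write.mode("overwrite").parquet("hdfs:///data/mysql_dump/orders")
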
>>
>> On Tue, Mar 21, 2017 at 3:37 PM, ayan guha <guha.ayan@gmail.com> wrote:
>>
>>> What is your use case? I am sure there must be a better way to solve
>>> it....
>>>
>>> On Wed, Mar 22, 2017 at 9:34 AM, Shashank Mandil <
>>> mandil.shashank@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am using Spark in YARN cluster mode.
>>>> When I run a YARN application, it creates multiple executors on the
>>>> Hadoop datanodes for processing.
>>>>
>>>> Is it possible for me to create a local Spark context (master=local) on
>>>> these executors so that I can get a SparkContext?
>>>>
>>>> Theoretically, since each executor is a Java process, this should be
>>>> doable, shouldn't it?
>>>>
>>>> Thanks,
>>>> Shashank
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>>>
>>
>>
>
>
> --
> Best Regards,
> Ayan Guha
>
