spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: call a mysql stored procedure from spark
Date Mon, 15 Aug 2016 15:02:32 GMT
Well, that is not the best way, as you have to wait for the RDBMS to process and
populate the temp table.

A sounder way would be to write a shell script that talks to the RDBMS first
and creates and populates that table.

Once that is done, the same shell script can kick off the Spark job to read the
temp table, which by then is fully populated.

What you are doing is basic ETL, and that temp table could just be a staging
table.

The advantage of this method is that you can be sure the data is ready
before opening the JDBC connection in Spark.
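
As a rough sketch of the Spark side of that flow only, something like the
following could build the JDBC options for the staging table. It assumes the
shell script has already finished populating the table before the job starts.
The object name StagingRead, the MySQL URL, and the table name staging_weights
are made-up placeholders, not values from this thread.

```scala
// Sketch of the Spark side only: it assumes the shell script has already
// created and populated the staging table before the Spark job is launched.
object StagingRead {
  // Builds the option map for Spark's JDBC data source.
  // The helper itself is hypothetical; the keys are the same ones used
  // elsewhere in this thread (url, dbtable, user, password).
  def options(url: String, table: String, user: String, password: String): Map[String, String] = {
    require(table.nonEmpty, "staging table name must not be empty")
    Map("url" -> url, "dbtable" -> table, "user" -> user, "password" -> password)
  }
}

// In spark-shell, with HiveContext defined as elsewhere in this thread
// (all argument values are placeholders):
// val opts = StagingRead.options("jdbc:mysql://dbhost:3306/mydb", "staging_weights", "scratchpad", "xxxxxxx")
// val df = HiveContext.read.format("jdbc").options(opts).load
```

Because the table already exists as a plain table by the time Spark connects,
"dbtable" is just its name and no stored procedure call goes through Spark.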

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 August 2016 at 15:55, sujeet jog <sujeet.jog@gmail.com> wrote:

> Thanks Michael, Michael,
>
> Ayan rightly said: yes, this stored procedure is invoked from the driver, and
> it creates the temporary table in the DB. The reason is that I want to load
> some specific data after processing it. I do not wish to bring it into Spark
> and instead want to keep the processing at the DB level. Later, once the temp
> table is prepared, I would load it via Spark SQL in the executors to process
> it further.
>
>
> On Mon, Aug 15, 2016 at 4:24 AM, ayan guha <guha.ayan@gmail.com> wrote:
>
>> More than technical feasibility, I would ask why you would invoke a stored
>> procedure for every row. If you do not need that, JdbcRDD is a moot point.
>>
>> If the stored procedure should be invoked from the driver, that can easily be
>> done. Otherwise, invoke it at most once per partition, on each executor.
>> On 15 Aug 2016 03:06, "Mich Talebzadeh" <mich.talebzadeh@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> The link deals with JDBC and states:
>>>
>>> [image: Inline images 1]
>>>
>>> So it is only SQL. It lacks support for stored procedures that return a
>>> result set.
>>>
>>> This is on an Oracle table
>>>
>>> scala>  var _ORACLEserver = "jdbc:oracle:thin:@rhes564:1521:mydb12"
>>> _ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12
>>> scala>  var _username = "scratchpad"
>>> _username: String = scratchpad
>>> scala> var _password = "xxxxxxx"
>>> _password: String = xxxxxxx
>>>
>>> scala> val s = HiveContext.read.format("jdbc").options(
>>>      | Map("url" -> _ORACLEserver,
>>>      | "dbtable" -> "exec weights_sp",
>>>      | "user" -> _username,
>>>      | "password" -> _password)).load
>>> java.sql.SQLSyntaxErrorException: ORA-00942: table or view does not
>>> exist
>>>
>>>
>>> and that stored procedure exists in Oracle
>>>
>>> scratchpad@MYDB12.MICH.LOCAL> desc weights_sp
>>> PROCEDURE weights_sp
>>>
>>>
>>> HTH
>>>
>>>
>>>
>>>
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 14 August 2016 at 17:42, Michael Armbrust <michael@databricks.com>
>>> wrote:
>>>
>>>> As described here
>>>> <http://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases>,
>>>> you can use the DataSource API to connect to an external database using
>>>> JDBC.  While the dbtable option is usually just a table name, it can
>>>> also be any valid SQL command that returns a table when enclosed in
>>>> (parentheses).  I'm not certain, but I'd expect you could use this feature
>>>> to invoke a stored procedure and return the results as a DataFrame.
>>>>
>>>> On Sat, Aug 13, 2016 at 10:40 AM, sujeet jog <sujeet.jog@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Is there a way to call a stored procedure using spark ?
>>>>>
>>>>>
>>>>> thanks,
>>>>> Sujeet
>>>>>
>>>>
>>>>
>>>
>
