spark-user mailing list archives

From Jason Nerothin <jasonnerot...@gmail.com>
Subject Re: How to extract data in parallel from RDBMS tables
Date Tue, 02 Apr 2019 19:18:56 GMT
I can *imagine* writing some sort of DataFrameReader-generation tool, but
am not aware of one that currently exists.
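
A minimal sketch of what such a tool might look like, assuming PySpark and
an already-created SparkSession. TableSpec, jdbc_read_options, copy_table,
and copy_all_tables are hypothetical names, not part of Spark; the option
keys (url, dbtable, partitionColumn, lowerBound, upperBound, numPartitions)
are the ones listed on the JDBC data source page linked further down the
thread. Credentials, driver settings, and discovering the per-table bounds
are left out. Each table is read with its own partitioned JDBC read, and
several tables are submitted at once from a thread pool:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List, NamedTuple


class TableSpec(NamedTuple):
    """Hypothetical per-table description; not a Spark class."""
    table: str
    partition_column: str  # must be a numeric, date, or timestamp column
    lower_bound: int
    upper_bound: int
    num_partitions: int    # upper limit on parallel reads for this table


def jdbc_read_options(url: str, spec: TableSpec) -> Dict[str, str]:
    """Build the Spark JDBC options for one partitioned table read."""
    return {
        "url": url,
        "dbtable": spec.table,
        "partitionColumn": spec.partition_column,
        # The bounds only set the partition stride; they do not filter rows.
        "lowerBound": str(spec.lower_bound),
        "upperBound": str(spec.upper_bound),
        "numPartitions": str(spec.num_partitions),
    }


def copy_table(spark, url: str, spec: TableSpec, out_dir: str) -> str:
    """Read one table over JDBC and write it out as Parquet."""
    df = spark.read.format("jdbc").options(**jdbc_read_options(url, spec)).load()
    path = f"{out_dir}/{spec.table}"
    df.write.mode("overwrite").parquet(path)
    return path


def copy_all_tables(spark, url: str, specs: Iterable[TableSpec],
                    out_dir: str, max_concurrent_tables: int = 4) -> List[str]:
    """Submit one Spark job per table from separate driver threads."""
    with ThreadPoolExecutor(max_workers=max_concurrent_tables) as pool:
        futures = [pool.submit(copy_table, spark, url, s, out_dir) for s in specs]
        # result() re-raises any failure from the corresponding table copy.
        return [f.result() for f in futures]
```

The thread pool is only one way to overlap the tables; it relies on Spark
accepting jobs from multiple threads of the same application. The total
number of open connections to the database is roughly the sum of
numPartitions over the tables being copied at the same time, so both
numbers would need tuning against what the database can take.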

On Tue, Apr 2, 2019 at 13:08 Surendra , Manchikanti <
surendra.manchikanti@gmail.com> wrote:

>
> Looking for a generic solution, not for a specific DB or number of tables.
>
>
> On Fri, Mar 29, 2019 at 5:04 AM Jason Nerothin <jasonnerothin@gmail.com>
> wrote:
>
>> How many tables? What DB?
>>
>> On Fri, Mar 29, 2019 at 00:50 Surendra , Manchikanti <
>> surendra.manchikanti@gmail.com> wrote:
>>
>>> Hi Jason,
>>>
>>> Thanks for your reply, but I am looking for a way to extract all the
>>> tables in a database in parallel.
>>>
>>>
>>> On Thu, Mar 28, 2019 at 2:50 PM Jason Nerothin <jasonnerothin@gmail.com>
>>> wrote:
>>>
>>>> Yes.
>>>>
>>>> If you use the numPartitions option, your max parallelism will be that
>>>> number. See also: partitionColumn, lowerBound, and upperBound
>>>>
>>>> https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html
>>>>
>>>> On Wed, Mar 27, 2019 at 23:06 Surendra , Manchikanti <
>>>> surendra.manchikanti@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> Is there any way to copy all the tables in parallel from an RDBMS using
>>>>> Spark? We are looking for functionality similar to Sqoop.
>>>>>
>>>>> Thanks,
>>>>> Surendra
>>>>>
>>>>> --
>>>> Thanks,
>>>> Jason
>>>>
>>> --
>> Thanks,
>> Jason
>>
> --
Thanks,
Jason
