sqoop-user mailing list archives

From Anna Szonyi <szo...@cloudera.com>
Subject Re: Query _ Sqoop on EMR
Date Mon, 06 Feb 2017 20:52:07 GMT
Hi Sneh,

Currently, exclude-tables is implemented with a plain contains check on the
list of excluded tables:

    for (String tableName : tables) {
      if (excludes.contains(tableName)) {
        System.out.println("Skipping table: " + tableName);
      }
      ...
    }

So wildcards currently don't work; however, adding support for some sort of
wildcard wouldn't be too difficult.
If this is something you need, it might make sense to create a jira
<https://issues.apache.org/jira/browse/SQOOP/> for it, describing your use case.
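As a rough illustration only (this is not Sqoop's actual code; the class,
the glob-to-regex helper, and the table names below are all hypothetical),
wildcard-based exclusion could look something like:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class WildcardExclude {

    // Translate a simple glob like "tmp_*" into a regex pattern.
    // Only '.' and '*' are handled; a real implementation would
    // escape the remaining regex metacharacters as well.
    static Pattern globToPattern(String glob) {
        return Pattern.compile(glob.replace(".", "\\.").replace("*", ".*"));
    }

    // True if tableName fully matches any of the exclude patterns.
    static boolean isExcluded(String tableName, List<Pattern> excludes) {
        return excludes.stream().anyMatch(p -> p.matcher(tableName).matches());
    }

    public static void main(String[] args) {
        List<Pattern> excludes = Arrays.asList(
                globToPattern("tmp_*"), globToPattern("audit_log"));
        for (String table : Arrays.asList(
                "reservationbooking", "tmp_staging", "audit_log")) {
            if (isExcluded(table, excludes)) {
                System.out.println("Skipping table: " + table);
            }
        }
        // Prints:
        // Skipping table: tmp_staging
        // Skipping table: audit_log
    }
}
```

The exact-match behaviour of the current contains check is preserved,
since a glob with no '*' compiles to a regex that only matches itself.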

Thanks,
Anna



On Sun, Feb 5, 2017 at 8:34 PM, Sneh <shanker.sneh@treebohotels.com> wrote:

> Hi Liz,
>
> I tried running the following command (create a job and then exec) to
> incrementally fetch data to S3 (on an AWS EMR cluster with EMRFS consistent
> view).
> sqoop job --create incre_reservation -- import --connect
> "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms"
> --username XXX --password XXX --table reservationbooking --incremental
> lastmodified --check-column modified_at --target-dir
> "s3://platform-poc/sqoop/reservation/incre"
>
> The error I get says that the FS should be HDFS and not S3.
> I came up with an *alternate* approach: "delta fetch" the data to HDFS and
> then run the merge command.
>
> I wanted to check whether the "hop" to HDFS can be avoided and the merge
> could happen directly on S3.
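>
> For reference, a rough sketch of that two-step approach (the host, the
> HDFS paths, and the merge key "id" below are placeholders, not my actual
> values):
>
> ```shell
> # 1. Incremental import into HDFS (S3 is rejected as --target-dir here)
> sqoop import --connect "jdbc:postgresql://<rds-host>/hms" \
>   --username XXX --password XXX \
>   --table reservationbooking \
>   --incremental lastmodified --check-column modified_at \
>   --target-dir /staging/reservationbooking_delta
>
> # 2. Merge the delta into the existing dataset on HDFS
> sqoop merge --new-data /staging/reservationbooking_delta \
>   --onto /warehouse/reservationbooking \
>   --target-dir /warehouse/reservationbooking_merged \
>   --jar-file reservationbooking.jar --class-name reservationbooking \
>   --merge-key id
>
> # 3. Copy the merged result to S3 (EMR's s3-dist-cp, or hadoop distcp)
> s3-dist-cp --src /warehouse/reservationbooking_merged \
>   --dest s3://platform-poc/sqoop/reservation/incre
> ```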
>
> I have another question, unrelated to the above:
> -> Is there a way I can use wildcards to exclude tables (without
> specifying the exact table names) while importing all the tables?
>
> Thanks for your time!
>
>
> Wishes,
> Sneh
> 8884383482
>
> On Fri, Feb 3, 2017 at 5:24 PM, Erzsebet Szilagyi <
> liz.szilagyi@cloudera.com> wrote:
>
>> Hi Sneh,
>> Could you give us a sample command that you are trying to run?
>> Thanks,
>> Liz
>>
>> On Thu, Jan 19, 2017 at 1:36 PM, Sneh <shanker.sneh@treebohotels.com>
>> wrote:
>>
>>> Dear Sqoop users,
>>>
>>> I've spawned an EMR cluster with Sqoop 1.4.6 and am trying to
>>> "incrementally fetch" data from RDS to S3.
>>> The error I get is that the FS should be HDFS and not S3.
>>>
>>> My EMR cluster is enabled for EMRFS consistent view.
>>> I am trying to build a pipeline from RDS to S3, and need help on how to
>>> proceed when an incremental Sqoop job is unable to write to S3.
>>>
>>> Please help!
>>>
>>>
>>> Wishes,
>>> Sneh
>>> 8884383482
>>>
>>>
>>
>>
>>
>>
>> --
>> Erzsebet Szilagyi
>> Software Engineer
>> www.cloudera.com <http://www.cloudera.com>
>>
>
>
