sqoop-dev mailing list archives

From Markus Kemper <mar...@cloudera.com>
Subject Re: Getting upper bound in --incremental mode
Date Thu, 20 Jul 2017 21:48:50 GMT
Hey Jagrut,

Are you using the Sqoop1 metastore job tool? (I assume yes.)
Do you want to override the currently stored --last-value when executing
the Sqoop job?



Markus Kemper
Customer Operations Engineer


On Thu, Jul 20, 2017 at 5:16 PM, Jagrut Sharma <jagrutsharma@gmail.com>
wrote:

> Hi Markus - The issue is that --incremental lastmodified always uses the
> current time as the upper bound, and that time is stored as the
> --last-value for the next run.
>
> In certain cases, the upper bound should instead come from the actual
> column values, and that value should be stored as the --last-value for
> the next run.
> --
> Jagrut
>
>
>
> On Wed, Jul 19, 2017 at 2:56 PM, Markus Kemper <markus@cloudera.com>
> wrote:
>
>> Hey Jagrut,
>>
>> Can you elaborate on the problem you are facing, and on what you mean
>> by "Is this possible to set while running sqoop?"
>>
>>
>> Markus Kemper
>> Customer Operations Engineer
>>
>>
>>
>> On Wed, Jul 19, 2017 at 5:43 PM, Jagrut Sharma <jagrutsharma@gmail.com>
>> wrote:
>>
>> > Hi Tony - I had assumed that append mode would not work for a timestamp
>> > column, but I gave it a try after your reply, and it works. It also
>> > gets the upper bound from the database itself. Thanks.
>> >
>> > --
>> > Jagrut
>> >
>> > On Wed, Jul 19, 2017 at 12:18 PM, Tony Foerster <tony@phdata.io> wrote:
>> >
>> >> Does `--incremental append` work for you?
>> >>
>> >> > You should specify append mode when importing a table where new rows
>> >> > are continually being added with increasing row id values
>> >>
>> >> Tony
>> >>
>> >> > On Jul 19, 2017, at 2:02 PM, Jagrut Sharma <jagrutsharma@gmail.com>
>> >> > wrote:
>> >> >
>> >> > Hi all - For --incremental mode with the 'lastmodified' option, Sqoop
>> >> > (v1.4.2) generates a query like:
>> >> > WHERE column >= last_modified_time and column < current_time
>> >> >
>> >> > The --last-value is set to the current_time and gets used for the
>> >> > next run.
>> >> >
>> >> > Here, the upper bound is always set to the current_time. In some
>> >> > cases, this upper bound needs to be taken from the database table
>> >> > column itself, so the query needs to be of the form:
>> >> > WHERE column >= last_modified_time and column < max_time_in_db_table_column
>> >> >
>> >> > And the --last-value for the next run needs to be set to
>> >> > max_time_in_db_table_column (and not the current_time).
>> >> >
>> >> > Is this possible to set while running sqoop? If not, is there a
>> >> > suggested workaround?
>> >> >
>> >> > Thanks a lot.
>> >> > --
>> >> > Jagrut
>> >>
>> >>
>> >
>>
>
>
>
> --
> Jagrut
>
>
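
One possible workaround for the request in this thread, sketched in Python under stated assumptions: skip the --incremental bookkeeping, compute both bounds yourself, and pass them to a plain import through --where. The sketch only builds the argument list and does not run Sqoop. The helper names (bounded_where, build_import_args), the JDBC URL, the table, the column, and the target directory are all hypothetical, and the quoted timestamp literal format is an assumption that may need adjusting for your database. The upper bound would have to be fetched separately (for example with a SELECT MAX(column) query) and kept by the caller as the lower bound for the next run.

```python
from datetime import datetime
from typing import List

# Assumed timestamp literal format; adjust for the target database.
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def bounded_where(column: str, last_value: datetime, upper_bound: datetime) -> str:
    """Build the predicate asked for in the thread:
    column >= last_value AND column < upper_bound."""
    if upper_bound < last_value:
        raise ValueError("upper_bound must not be earlier than last_value")
    lower = last_value.strftime(TS_FORMAT)
    upper = upper_bound.strftime(TS_FORMAT)
    return f"{column} >= '{lower}' AND {column} < '{upper}'"


def build_import_args(
    connect: str,
    table: str,
    target_dir: str,
    column: str,
    last_value: datetime,
    upper_bound: datetime,
) -> List[str]:
    """Return an argument list for a plain `sqoop import` restricted by --where.

    No --incremental flag is used, so Sqoop stores no --last-value; the caller
    is expected to persist upper_bound and pass it as last_value next time.
    """
    return [
        "sqoop", "import",
        "--connect", connect,
        "--table", table,
        "--target-dir", target_dir,
        "--append",  # add files to the existing target directory
        "--where", bounded_where(column, last_value, upper_bound),
    ]
```

Because the upper bound is exclusive and the lower bound is inclusive, rows whose timestamp equals the maximum are left out of the current run and picked up by the next one, which matches the query form requested above.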
