sqoop-dev mailing list archives

From Jagrut Sharma <jagrutsha...@gmail.com>
Subject Re: Getting upper bound in --incremental mode
Date Thu, 20 Jul 2017 21:16:00 GMT
Hi Markus - The question was that --incremental lastmodified always takes
the current time as the upper bound, and this gets stored as the
--last-value for the next run.

In certain cases, it is desirable for the upper bound to come from the
actual column values, and for that value to be stored as the --last-value
for the next run.
--
Jagrut
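
The difference between the two upper bounds described above can be sketched in a few lines of Python. This is only an illustration of the windowing logic discussed in this thread, not Sqoop's code: the function names are hypothetical, and the half-open form (>= lower, < upper) is taken from the query shape quoted in the original question.

```python
from datetime import datetime


def lastmodified_window(last_value, now):
    """Window as described for lastmodified mode: upper bound is the current time."""
    return last_value, now


def column_bound_window(last_value, column_values):
    """Desired window: upper bound is the max value actually present in the column."""
    newer = [v for v in column_values if v >= last_value]
    # With no newer rows, the bound stays where it was.
    upper = max(newer) if newer else last_value
    return last_value, upper


def rows_in_window(column_values, window):
    """Rows matched by: WHERE column >= lower AND column < upper."""
    lower, upper = window
    return [v for v in column_values if lower <= v < upper]


if __name__ == "__main__":
    values = [
        datetime(2017, 7, 19, 10, 0),
        datetime(2017, 7, 19, 10, 5),
        datetime(2017, 7, 19, 10, 10),
    ]
    last_value = datetime(2017, 7, 19, 10, 0)
    now = datetime(2017, 7, 19, 10, 30)

    for name, window in [
        ("current time", lastmodified_window(last_value, now)),
        ("column max", column_bound_window(last_value, values)),
    ]:
        # window[1] is what would be carried forward as the next --last-value.
        print(name, "next last-value:", window[1],
              "rows:", len(rows_in_window(values, window)))
```

In either case the upper bound of the window is the value that would be carried forward as the next --last-value; the only difference is whether it comes from the clock or from the data.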



On Wed, Jul 19, 2017 at 2:56 PM, Markus Kemper <markus@cloudera.com> wrote:

> Hey Jagrut,
>
> Can you elaborate more about the problem you are facing and what you mean
> by (Is this possible to set while running sqoop?).
>
>
> Markus Kemper
> Customer Operations Engineer
>
>
> On Wed, Jul 19, 2017 at 5:43 PM, Jagrut Sharma <jagrutsharma@gmail.com> wrote:
>
> > Hi Tony - I was under the assumption that append mode would not work for a
> > timestamp column. But I gave it a try after your reply, and it works. And
> > it gets the upper bound from the database itself. Thanks.
> >
> > --
> > Jagrut
> >
> > On Wed, Jul 19, 2017 at 12:18 PM, Tony Foerster <tony@phdata.io> wrote:
> >
> >> Does `--incremental append` work for you?
> >>
> >> > You should specify append mode when importing a table where new rows
> >> > are continually being added with increasing row id values
> >>
> >> Tony
> >>
> >> > On Jul 19, 2017, at 2:02 PM, Jagrut Sharma <jagrutsharma@gmail.com> wrote:
> >> >
> >> > Hi all - For --incremental mode with 'lastmodified' option, Sqoop
> >> > (v 1.4.2) generates a query like:
> >> > WHERE column >= last_modified_time and column < current_time
> >> >
> >> > The --last-value is set to the current_time and gets used for the next
> >> > run.
> >> >
> >> > Here, the upper bound is always set to the current_time. In some cases,
> >> > this upper bound is required to be taken from the database table column
> >> > itself. So, the query is required of the form:
> >> > WHERE column >= last_modified_time and column < max_time_in_db_table_column
> >> >
> >> > And the --last-value for next run needs to be set as
> >> > the max_time_in_db_table_column (and not the current_time).
> >> >
> >> > Is this possible to set while running sqoop?  If no, is there any
> >> > workaround suggested for this?
> >> >
> >> > Thanks a lot.
> >> > --
> >> > Jagrut
> >>
> >>
> >
>



-- 
Jagrut
