sqoop-dev mailing list archives

From Jagrut Sharma <jagrutsha...@gmail.com>
Subject Re: Getting upper bound in --incremental mode
Date Thu, 20 Jul 2017 22:30:42 GMT
Hi Markus - I'm using Sqoop v1, but with a custom metastore (not the one
that Sqoop provides). My original question was on deriving the --last-value
based on table column values (and not time of job execution).

Thanks.
--
Jagrut

On Thu, Jul 20, 2017 at 2:48 PM, Markus Kemper <markus@cloudera.com> wrote:

> Hey Jagrut,
>
> Are you using the Sqoop1 Metastore job tool (assuming yes)?
> Are you wanting to override the current stored --last-value when executing
> the Sqoop job?
>
>
>
> Markus Kemper
> Customer Operations Engineer
>
>
> On Thu, Jul 20, 2017 at 5:16 PM, Jagrut Sharma <jagrutsharma@gmail.com>
> wrote:
>
>> Hi Markus - The question was that the --incremental lastmodified
>> option always takes the current time as the upper bound, and this gets
>> stored as the --last-value for the next run.
>>
>> In certain cases, it is desirable that the upper bound should come from
>> the actual column values, and that should get set for the --last-value for
>> next run.
>> --
>> Jagrut
>>
>>
>>
>> On Wed, Jul 19, 2017 at 2:56 PM, Markus Kemper <markus@cloudera.com>
>> wrote:
>>
>>> Hey Jagrut,
>>>
>>> Can you elaborate more on the problem you are facing and what you mean
>>> by "Is this possible to set while running sqoop?"
>>>
>>>
>>> Markus Kemper
>>> Customer Operations Engineer
>>>
>>>
>>>
>>> On Wed, Jul 19, 2017 at 5:43 PM, Jagrut Sharma <jagrutsharma@gmail.com>
>>> wrote:
>>>
>>> > Hi Tony - I was under the assumption that append mode will not work for
>>> > timestamp column. But I gave it a try after your reply, and it works. And
>>> > it gets the upper bound from the database itself. Thanks.
>>> >
>>> > --
>>> > Jagrut
>>> >
>>> > On Wed, Jul 19, 2017 at 12:18 PM, Tony Foerster <tony@phdata.io> wrote:
>>> >
>>> >> Does `--incremental append` work for you?
>>> >>
>>> >> > You should specify append mode when importing a table where new rows
>>> >> > are continually being added with increasing row id values
>>> >>
>>> >> Tony
>>> >>
>>> >> > On Jul 19, 2017, at 2:02 PM, Jagrut Sharma <jagrutsharma@gmail.com>
>>> >> wrote:
>>> >> >
>>> >> > Hi all - For --incremental mode with 'lastmodified' option, Sqoop (v 1.4.2)
>>> >> > generates a query like:
>>> >> > WHERE column >= last_modified_time and column < current_time
>>> >> >
>>> >> > The --last-value is set to the current_time and gets used for the next run.
>>> >> >
>>> >> > Here, the upper bound is always set to the current_time. In some cases,
>>> >> > this upper bound is required to be taken from the database table column
>>> >> > itself. So, the query is required of the form:
>>> >> > WHERE column >= last_modified_time and column < max_time_in_db_table_column
>>> >> >
>>> >> > And the --last-value for next run needs to be set as
>>> >> > the max_time_in_db_table_column (and not the current_time).
>>> >> >
>>> >> > Is this possible to set while running sqoop?  If no, is there any
>>> >> > workaround suggested for this?
>>> >> >
>>> >> > Thanks a lot.
>>> >> > --
>>> >> > Jagrut
>>> >>
>>> >>
>>> >
>>>
>>
>>
>>
>> --
>> Jagrut
>>
>>
>


-- 
Jagrut
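
For anyone landing on this thread with the same question, here is a minimal
sketch of the bound derivation asked about above: take the upper bound from
MAX(column) in the table rather than from the current time, and carry that
value forward as the next --last-value. This is not Sqoop code. The helper
next_window, the table name events and the column updated_at are hypothetical,
and sqlite3 only stands in for the real source database. The query shape
(>= lower bound, < upper bound) follows the form given in the original
question.

```python
import sqlite3


def next_window(conn, table, column, last_value):
    """Derive an import window whose upper bound comes from the table itself.

    Returns (where_clause, params, new_last_value). When there is nothing
    newer than last_value, where_clause is None and last_value is unchanged.
    Note: table and column are interpolated directly into the SQL, so they
    must come from trusted configuration, never from user input.
    """
    # Upper bound is read from the data, not from the clock.
    (max_value,) = conn.execute(
        f"SELECT MAX({column}) FROM {table}"
    ).fetchone()

    # Empty table, or no row newer than the stored value: empty window.
    if max_value is None or max_value <= last_value:
        return None, (), last_value

    # Same shape as the query in the original question:
    #   column >= last_value AND column < max_time_in_db_table_column
    where = f"{column} >= ? AND {column} < ?"
    return where, (last_value, max_value), max_value


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [
            (1, "2017-07-19 10:00:00"),
            (2, "2017-07-19 11:00:00"),
            (3, "2017-07-19 12:00:00"),
        ],
    )

    where, params, last_value = next_window(
        conn, "events", "updated_at", "1970-01-01 00:00:00"
    )
    rows = conn.execute(
        f"SELECT id FROM events WHERE {where} ORDER BY id", params
    ).fetchall()
    print(rows, last_value)
```

Because the upper bound is exclusive, rows that carry the maximum timestamp
are left out of the current window and picked up by the next run, whose lower
bound is inclusive. The side effect is that the newest rows are not imported
until something newer arrives. The returned value is what one would store in
a custom metastore as the next --last-value.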
