sqoop-dev mailing list archives

From Boglarka Egyed <b...@cloudera.com>
Subject Re: sqoop hbase incremental import - Sqoop 1.4.6
Date Wed, 01 Mar 2017 10:38:46 GMT
Hi Jilani,

Here is an example: SQOOP-3053
<https://issues.apache.org/jira/browse/SQOOP-3053> with the review
<https://reviews.apache.org/r/54206/> linked. Please make your changes on
trunk, as it will be used to cut the future release, so your patch
definitely needs to apply cleanly on it.

Thanks,
Bogi

On Wed, Mar 1, 2017 at 3:46 AM, Jilani Shaik <jilani2423@gmail.com> wrote:

> Hi Bogi,
>
> Can you provide me with sample Jira tickets and review requests similar
> to this, so that I can proceed further?
>
> I applied the code changes on top of the "sqoop-release-1.4.6-rc0" branch
> from the Sqoop git repository. If you suggest the right branch, I will take
> the code from there and apply the changes before submitting the review
> request.
>
> Thanks,
> Jilani
>
> On Mon, Feb 27, 2017 at 3:05 AM, Boglarka Egyed <bogi@cloudera.com> wrote:
>
>> Hi Jilani,
>>
>> To get your change committed please do the following:
>> * Open a JIRA ticket for your change in Apache's JIRA system
>> <https://issues.apache.org/jira/browse/SQOOP/> for project Sqoop
>> * Create a review request at Apache's review board
>> <https://reviews.apache.org/r/> for project Sqoop and link it to the JIRA
>> ticket
>>
>> Please consider the guidelines below:
>>
>> Review board
>> * Summary: generate your summary using the issue's jira key + jira title
>> * Groups: add the relevant group (Sqoop) so everyone on the project will
>> know about your patch
>> * Bugs: add the issue's jira key so it's easy to navigate to the jira side
>> * Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
>> * As soon as the patch gets committed, it's very useful for the
>> community if you close the review and mark it as "Submitted" at the Review
>> board. The button to do this is at the top right of your own tickets,
>> right next to the Download Diff button.
>>
>> Jira
>> * Link: please add the link of the review as an external/web link so it's
>> easy to navigate to the reviews side
>> * Status: mark it as "patch available"
>>
>> The Sqoop community will receive emails about your new ticket and review
>> request and will review your change.
>>
>> Thanks,
>> Bogi
>>
>>
>> On Sat, Feb 25, 2017 at 2:14 AM, Jilani Shaik <jilani2423@gmail.com>
>> wrote:
>>
>> > Do we have any update?
>> >
>> > I checked out the 1.4.6 code, made code changes to achieve this, and
>> > tested them on a cluster, where they work as expected. Is there a way I
>> > can contribute this as a patch, so that the committers can validate it
>> > further and suggest any changes required to move forward? Please suggest
>> > the approach.
>> >
>> > Thanks,
>> > Jilani
>> >
>> > On Sun, Feb 5, 2017 at 10:41 PM, Jilani Shaik <jilani2423@gmail.com>
>> > wrote:
>> >
>> > > Hi Liz,
>> > >
>> > > Let's say we inserted data into a table with an initial import, and it
>> > > looks like this in the HBase shell:
>> > >
>> > >  1   column=pay:amount, timestamp=1485129654025, value=4.99
>> > >  1   column=pay:customer_id, timestamp=1485129654025, value=1
>> > >  1   column=pay:last_update, timestamp=1485129654025, value=2017-01-23 05:29:09.0
>> > >  1   column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> > >  1   column=pay:rental_id, timestamp=1485129654025, value=573
>> > >  1   column=pay:staff_id, timestamp=1485129654025, value=1
>> > >  10  column=pay:amount, timestamp=1485129504390, value=5.99
>> > >  10  column=pay:customer_id, timestamp=1485129504390, value=1
>> > >  10  column=pay:last_update, timestamp=1485129504390, value=2006-02-15 22:12:30.0
>> > >  10  column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> > >  10  column=pay:rental_id, timestamp=1485129504390, value=4526
>> > >  10  column=pay:staff_id, timestamp=1485129504390, value=2
>> > >
>> > >
>> > > Now assume that in the source, rental_id becomes NULL for rowkey "1",
>> > > and then we run an incremental import into HBase. With the current
>> > > import, the final HBase data after the incremental import will look
>> > > like this:
>> > >
>> > >  1   column=pay:amount, timestamp=1485129654025, value=4.99
>> > >  1   column=pay:customer_id, timestamp=1485129654025, value=1
>> > >  1   column=pay:last_update, timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> > >  1   column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> > >  1   column=pay:rental_id, timestamp=1485129654025, value=573
>> > >  1   column=pay:staff_id, timestamp=1485129654025, value=1
>> > >  10  column=pay:amount, timestamp=1485129504390, value=5.99
>> > >  10  column=pay:customer_id, timestamp=1485129504390, value=1
>> > >  10  column=pay:last_update, timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> > >  10  column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> > >  10  column=pay:rental_id, timestamp=1485129504390, value=126
>> > >  10  column=pay:staff_id, timestamp=1485129504390, value=2
>> > >
>> > >
>> > >
>> > > As the source column "rental_id" becomes NULL for rowkey "1", the final
>> > > HBase table should not have "rental_id" for rowkey "1". I am expecting
>> > > the data below for these rowkeys:
>> > >
>> > >
>> > >  1   column=pay:amount, timestamp=1485129654025, value=4.99
>> > >  1   column=pay:customer_id, timestamp=1485129654025, value=1
>> > >  1   column=pay:last_update, timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> > >  1   column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> > >  1   column=pay:staff_id, timestamp=1485129654025, value=1
>> > >  10  column=pay:amount, timestamp=1485129504390, value=5.99
>> > >  10  column=pay:customer_id, timestamp=1485129504390, value=1
>> > >  10  column=pay:last_update, timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> > >  10  column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> > >  10  column=pay:rental_id, timestamp=1485129504390, value=126
>> > >  10  column=pay:staff_id, timestamp=1485129504390, value=2
>> > >
>> > >
>> > > Please let me know if anything further is required.
>> > >
>> > >
>> > > Thanks,
>> > > Jilani
>> > >
>> > > On Tue, Jan 31, 2017 at 3:38 AM, Erzsebet Szilagyi <
>> > > liz.szilagyi@cloudera.com> wrote:
>> > >
>> > >> Hi Jilani,
>> > >> I'm not sure I completely understand what you are trying to do. Could
>> > >> you give us some examples with e.g. 4 columns and 2 rows of example
>> > >> data showing the changes that happen compared to the changes you'd
>> > >> like to see?
>> > >> Thanks,
>> > >> Liz
>> > >>
>> > >> On Tue, Jan 31, 2017 at 5:18 AM, Jilani Shaik <jilani2423@gmail.com>
>> > >> wrote:
>> > >>
>> > >> >
>> > >> > Please help in resolving this issue. I am going through the source
>> > >> > code, and the required behavior appears to be missing, but I am not
>> > >> > sure whether it was left out for some reason.
>> > >> >
>> > >> > Please suggest how to proceed with this scenario.
>> > >> >
>> > >> > Thanks,
>> > >> > Jilani
>> > >> >
>> > >> > On Sun, Jan 22, 2017 at 6:45 PM, Jilani Shaik <
>> jilani2423@gmail.com>
>> > >> > wrote:
>> > >> >
>> > >> >> Hi,
>> > >> >>
>> > >> >> We have a scenario where we are importing data into HBase with
>> > >> >> Sqoop incremental import.
>> > >> >>
>> > >> >> Let's say we imported a table, and later the source table was
>> > >> >> updated so that some columns became NULL for some rows. Then, after
>> > >> >> an incremental import, these columns should no longer be present in
>> > >> >> the HBase table. But right now these columns remain there with
>> > >> >> their previous values.
>> > >> >>
>> > >> >> Is there any fix to overcome this issue?
>> > >> >>
>> > >> >>
>> > >> >> Thanks,
>> > >> >> Jilani
>> > >> >>
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>
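
The behaviour discussed in the quoted thread can be illustrated with a small
sketch. This is not Sqoop's code: an HBase row is modelled as a plain Python
dict, and the function name apply_incremental and its delete_nulls flag are
hypothetical, introduced only to contrast the behaviour Jilani reports (a
NULL source column leaves the old cell in place) with the behaviour he
expects (the cell is removed).

```python
from typing import Dict, Optional

# An HBase row modelled as {"family:qualifier": value}. Illustration only.
Row = Dict[str, str]


def apply_incremental(existing: Row,
                      source: Dict[str, Optional[str]],
                      delete_nulls: bool) -> Row:
    """Merge one updated source record into an existing row.

    delete_nulls=False mimics the reported behaviour: NULL columns are
    skipped, so the previously imported cell survives.
    delete_nulls=True mimics the requested behaviour: NULL columns are
    removed from the row.
    """
    result = dict(existing)
    for column, value in source.items():
        if value is not None:
            result[column] = value        # normal overwrite of the cell
        elif delete_nulls:
            result.pop(column, None)      # drop the cell that became NULL
    return result


# Rowkey "1" from the thread, reduced to three columns.
existing = {
    "pay:amount": "4.99",
    "pay:rental_id": "573",
    "pay:last_update": "2017-01-23 05:29:09.0",
}
# The source row after rental_id was set to NULL.
source = {
    "pay:amount": "4.99",
    "pay:rental_id": None,
    "pay:last_update": "2017-02-05 05:29:09.0",
}

current = apply_incremental(existing, source, delete_nulls=False)
expected = apply_incremental(existing, source, delete_nulls=True)
```

With delete_nulls=False the row still carries pay:rental_id=573, matching the
second listing in the thread; with delete_nulls=True the column is gone,
matching the third listing.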
