sqoop-dev mailing list archives

From Jilani Shaik <jilani2...@gmail.com>
Subject Re: sqoop hbase incremental import - Sqoop 1.4.6
Date Fri, 10 Mar 2017 03:42:22 GMT
Hi Bogi,

- Built the jar from trunk using the "jar-all" Ant target

- Copied the jar to /opt/mapr/sqoop/sqoop-1.4.6/

- Moved the existing jar out to another location

- Then executed the command below to run the import
sqoop import --connect jdbc:mysql://10.0.0.300/database123 --verbose
--username test --password test123$ --table payment -m 2 --hbase-table
/database/demoapp/hbase/payment --column-family pay --hbase-row-key
payment_id --incremental lastmodified --merge-key payment_id --check-column
last_update --last-value '2017-01-08 08:02:05.0'


I followed the same steps for both jars: the one built from trunk and the one
built from the 1.4.6 branch.

Where do you suspect the multiple Avro jars come in: when the jar is built, or
when the command is run with that jar?
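
For the run-time half of that question, one way to check is to scan the classpath for core Avro jars and see which of them contain org.apache.avro.LogicalType, the class named in the stack trace quoted below. This is a minimal sketch, not part of Sqoop: the function names are made up for illustration, and it assumes the classpath string is supplied by the caller (for example the output of `hadoop classpath` plus the Sqoop lib directory, which is an assumption about this installation).

```python
import glob
import os
import zipfile

# Class reported as missing in the stack trace
MISSING_CLASS = "org/apache/avro/LogicalType.class"
# Marker used here to recognise a core Avro jar (illustrative choice)
AVRO_MARKER = "org/apache/avro/Schema.class"


def expand_classpath(classpath):
    """Expand a Java-style classpath string into a list of jar paths."""
    jars = []
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if entry.endswith("*"):
            # Java treats "dir/*" as "every .jar file in dir"
            pattern = os.path.join(os.path.dirname(entry), "*.jar")
            jars.extend(sorted(glob.glob(pattern)))
        elif entry.endswith(".jar") and os.path.isfile(entry):
            jars.append(entry)
    return jars


def find_avro_jars(classpath, class_entry=MISSING_CLASS):
    """Return (jar_path, has_class) for every core Avro jar on the classpath."""
    results = []
    for jar in expand_classpath(classpath):
        try:
            with zipfile.ZipFile(jar) as zf:
                names = set(zf.namelist())
        except (zipfile.BadZipFile, OSError):
            continue  # skip unreadable or non-zip entries
        if AVRO_MARKER in names:
            results.append((jar, class_entry in names))
    return results
```

More than one entry in the result, or an entry whose flag is False, would be consistent with an older Avro jar being picked up at run time. Calling `find_avro_jars("/opt/mapr/sqoop/sqoop-1.4.6/lib/*")` would be one place to start, assuming that directory holds the Sqoop dependencies.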


Thanks,
Jilani

On Thu, Mar 9, 2017 at 9:21 AM, Boglarka Egyed <bogi@cloudera.com> wrote:

> Hi Jilani,
>
> I suspect that you have an old version of Avro or even multiple Avro
> versions on your classpath and thus Sqoop uses an older one.
>
> Could you please provide a list of the exact commands you have performed
> so that I can reproduce the issue?
>
> Thanks,
> Bogi
>
> On Thu, Mar 9, 2017 at 2:51 AM, Jilani Shaik <jilani2423@gmail.com> wrote:
>
>> Can someone give me pointers on what I am missing between the trunk and
>> 1.4.6 builds? The trunk build gives the error mentioned in the mail chain below.
>>
>> I followed the same Ant target to prepare the jar for both branches, but
>> even so the 1.4.6 jar is different from the 1.4.7 jar, which is created from trunk.
>>
>> Thanks,
>> Jilani
>>
>>
>> On Wed, Mar 8, 2017 at 3:29 AM, Jilani Shaik <jilani2423@gmail.com>
>> wrote:
>>
>> > Hi Bogi,
>> >
>> > I get the error below when I build the jar from trunk and run a sqoop
>> > import from a MySQL database table, whereas similar changes work on
>> > branch 1.4.6.
>> >
>> >
>> > 17/03/08 01:06:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-SNAPSHOT
>> > 17/03/08 01:06:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.
>> > 17/03/08 01:06:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
>> > 17/03/08 01:06:25 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
>> > 17/03/08 01:06:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
>> > Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/LogicalType
>> >         at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67)
>> >         at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
>> >         at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:270)
>> >         at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:97)
>> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:617)
>> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
>> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
>> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
>> >         at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
>> > Caused by: java.lang.ClassNotFoundException: org.apache.avro.LogicalType
>> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> >         ... 11 more
>> >
>> > Please let me know what is missing and how to resolve this exception.
>> > Let me know if you need further details.
>> >
>> > Thanks,
>> > Jilani
>> >
>> > On Wed, Mar 1, 2017 at 4:38 AM, Boglarka Egyed <bogi@cloudera.com>
>> > wrote:
>> >
>> >> Hi Jilani,
>> >>
>> >> This is an example: SQOOP-3053
>> >> <https://issues.apache.org/jira/browse/SQOOP-3053> with the review
>> >> <https://reviews.apache.org/r/54206/> linked. Please make your changes on
>> >> trunk, as it will be used to cut the future release, so your patch
>> >> definitely needs to be able to apply to it.
>> >>
>> >> Thanks,
>> >> Bogi
>> >>
>> >> On Wed, Mar 1, 2017 at 3:46 AM, Jilani Shaik <jilani2423@gmail.com>
>> >> wrote:
>> >>
>> >> > Hi Bogi,
>> >> >
>> >> > Can you provide me with sample Jira tickets and review requests similar to
>> >> > this, so that I can proceed further?
>> >> >
>> >> > I applied the code changes on the "sqoop-release-1.4.6-rc0" branch from
>> >> > the Sqoop git repository. If you suggest the right branch, I will take the
>> >> > code from there and apply the changes before submitting the review request.
>> >> >
>> >> > Thanks,
>> >> > Jilani
>> >> >
>> >> > On Mon, Feb 27, 2017 at 3:05 AM, Boglarka Egyed <bogi@cloudera.com>
>> >> > wrote:
>> >> >
>> >> >> Hi Jilani,
>> >> >>
>> >> >> To get your change committed please do the following:
>> >> >> * Open a JIRA ticket for your change in Apache's JIRA system
>> >> >> <https://issues.apache.org/jira/browse/SQOOP/> for project Sqoop
>> >> >> * Create a review request at Apache's review board
>> >> >> <https://reviews.apache.org/r/> for project Sqoop and link it to the
>> >> >> JIRA ticket
>> >> >>
>> >> >> Please consider the guidelines below:
>> >> >>
>> >> >> Review board
>> >> >> * Summary: generate your summary using the issue's jira key + jira title
>> >> >> * Groups: add the relevant group so everyone on the project will know
>> >> >> about your patch (Sqoop)
>> >> >> * Bugs: add the issue's jira key so it's easy to navigate to the jira side
>> >> >> * Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
>> >> >> * And as soon as the patch gets committed, it's very useful for the
>> >> >> community if you close the review and mark it as "Submitted" at the Review
>> >> >> board. The button to do this is top right at your own tickets, right next
>> >> >> to the Download Diff button.
>> >> >>
>> >> >> Jira
>> >> >> * Link: please add the link of the review as an external/web link so it's
>> >> >> easy to navigate to the reviews side
>> >> >> * Status: mark it as "patch available"
>> >> >>
>> >> >> Sqoop community will receive emails about your new ticket and review
>> >> >> request and will review your change.
>> >> >>
>> >> >> Thanks,
>> >> >> Bogi
>> >> >>
>> >> >>
>> >> >> On Sat, Feb 25, 2017 at 2:14 AM, Jilani Shaik <jilani2423@gmail.com>
>> >> >> wrote:
>> >> >>
>> >> >> > Do we have any update?
>> >> >> >
>> >> >> > I checked out the 1.4.6 code and made code changes to achieve this,
>> >> >> > tested them in a cluster, and they work as expected. Is there a way I can
>> >> >> > contribute this as a patch, so that the committers can validate it further
>> >> >> > and suggest any changes required to move forward? Please suggest the
>> >> >> > approach.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Jilani
>> >> >> >
>> >> >> > On Sun, Feb 5, 2017 at 10:41 PM, Jilani Shaik <jilani2423@gmail.com>
>> >> >> > wrote:
>> >> >> >
>> >> >> > > Hi Liz,
>> >> >> > >
>> >> >> > > Let's say we inserted data into a table with the initial import, and it
>> >> >> > > looks like this in the hbase shell:
>> >> >> > >
>> >> >> > >  1                                     column=pay:amount,
>> >> >> > > timestamp=1485129654025, value=4.99
>> >> >> > >  1                                     column=pay:customer_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  1                                     column=pay:last_update,
>> >> >> > > timestamp=1485129654025, value=2017-01-23 05:29:09.0
>> >> >> > >  1                                     column=pay:payment_date,
>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >> >> > >  1                                     column=pay:rental_id,
>> >> >> > > timestamp=1485129654025, value=573
>> >> >> > >  1                                     column=pay:staff_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  10                                    column=pay:amount,
>> >> >> > > timestamp=1485129504390, value=5.99
>> >> >> > >  10                                    column=pay:customer_id,
>> >> >> > > timestamp=1485129504390, value=1
>> >> >> > >  10                                    column=pay:last_update,
>> >> >> > > timestamp=1485129504390, value=2006-02-15 22:12:30.0
>> >> >> > >  10                                    column=pay:payment_date,
>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >> >> > >  10                                    column=pay:rental_id,
>> >> >> > > timestamp=1485129504390, value=4526
>> >> >> > >  10                                    column=pay:staff_id,
>> >> >> > > timestamp=1485129504390, value=2
>> >> >> > >
>> >> >> > >
>> >> >> > > Now assume that in the source rental_id becomes NULL for rowkey "1", and
>> >> >> > > then we do an incremental import into HBase. With the current import, the
>> >> >> > > final HBase data after the incremental import will look like this:
>> >> >> > >
>> >> >> > >  1                                     column=pay:amount,
>> >> >> > > timestamp=1485129654025, value=4.99
>> >> >> > >  1                                     column=pay:customer_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  1                                     column=pay:last_update,
>> >> >> > > timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> >> >> > >  1                                     column=pay:payment_date,
>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >> >> > >  1                                     column=pay:rental_id,
>> >> >> > > timestamp=1485129654025, value=573
>> >> >> > >  1                                     column=pay:staff_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  10                                    column=pay:amount,
>> >> >> > > timestamp=1485129504390, value=5.99
>> >> >> > >  10                                    column=pay:customer_id,
>> >> >> > > timestamp=1485129504390, value=1
>> >> >> > >  10                                    column=pay:last_update,
>> >> >> > > timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> >> >> > >  10                                    column=pay:payment_date,
>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >> >> > >  10                                    column=pay:rental_id,
>> >> >> > > timestamp=1485129504390, value=126
>> >> >> > >  10                                    column=pay:staff_id,
>> >> >> > > timestamp=1485129504390, value=2
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > As the source column "rental_id" becomes NULL for rowkey "1", the final
>> >> >> > > HBase data should not have "rental_id" for rowkey "1". I am expecting the
>> >> >> > > data below for these rowkeys.
>> >> >> > >
>> >> >> > >
>> >> >> > >  1                                     column=pay:amount,
>> >> >> > > timestamp=1485129654025, value=4.99
>> >> >> > >  1                                     column=pay:customer_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  1                                     column=pay:last_update,
>> >> >> > > timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> >> >> > >  1                                     column=pay:payment_date,
>> >> >> > > timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >> >> > >  1                                     column=pay:staff_id,
>> >> >> > > timestamp=1485129654025, value=1
>> >> >> > >  10                                    column=pay:amount,
>> >> >> > > timestamp=1485129504390, value=5.99
>> >> >> > >  10                                    column=pay:customer_id,
>> >> >> > > timestamp=1485129504390, value=1
>> >> >> > >  10                                    column=pay:last_update,
>> >> >> > > timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> >> >> > >  10                                    column=pay:payment_date,
>> >> >> > > timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >> >> > >  10                                    column=pay:rental_id,
>> >> >> > > timestamp=1485129504390, value=126
>> >> >> > >  10                                    column=pay:staff_id,
>> >> >> > > timestamp=1485129504390, value=2
>> >> >> > >
>> >> >> > >
>> >> >> > > Please let me know if anything further is required.
>> >> >> > >
>> >> >> > >
>> >> >> > > Thanks,
>> >> >> > > Jilani
>> >> >> > >
>> >> >> > > On Tue, Jan 31, 2017 at 3:38 AM, Erzsebet Szilagyi <
>> >> >> > > liz.szilagyi@cloudera.com> wrote:
>> >> >> > >
>> >> >> > >> Hi Jilani,
>> >> >> > >> I'm not sure I completely understand what you are trying to do. Could you
>> >> >> > >> give us some examples with e.g. 4 columns and 2 rows of example data
>> >> >> > >> showing the changes that happen compared to the changes you'd like to see?
>> >> >> > >> Thanks,
>> >> >> > >> Liz
>> >> >> > >>
>> >> >> > >> On Tue, Jan 31, 2017 at 5:18 AM, Jilani Shaik <jilani2423@gmail.com>
>> >> >> > >> wrote:
>> >> >> > >>
>> >> >> > >> >
>> >> >> > >> > Please help in resolving the issue. I am going through the source code, and
>> >> >> > >> > somehow the required behaviour is missing, but I am not sure whether we
>> >> >> > >> > avoided this behaviour for some reason.
>> >> >> > >> >
>> >> >> > >> > Please give me some suggestions on how to handle this scenario.
>> >> >> > >> >
>> >> >> > >> > Thanks,
>> >> >> > >> > Jilani
>> >> >> > >> >
>> >> >> > >> > On Sun, Jan 22, 2017 at 6:45 PM, Jilani Shaik <jilani2423@gmail.com>
>> >> >> > >> > wrote:
>> >> >> > >> >
>> >> >> > >> >> Hi,
>> >> >> > >> >>
>> >> >> > >> >> We have a scenario where we are importing data into HBase with sqoop
>> >> >> > >> >> incremental import.
>> >> >> > >> >>
>> >> >> > >> >> Let's say we imported a table, and later the source table was updated so
>> >> >> > >> >> that some columns have null values for some rows. Then, when doing an
>> >> >> > >> >> incremental import, as per HBase these columns should no longer be present
>> >> >> > >> >> in the HBase table. But right now these columns remain available with
>> >> >> > >> >> their previous values.
>> >> values.
>> >> >> > >> >>
>> >> >> > >> >> Is there any fix to overcome this issue?
>> >> >> > >> >>
>> >> >> > >> >>
>> >> >> > >> >> Thanks,
>> >> >> > >> >> Jilani
>> >> >> > >> >>
>> >> >> > >> >
>> >> >> > >> >
>> >> >> > >>
>> >> >> > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
>
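
The behaviour requested at the start of this thread (a column that becomes NULL in the source should disappear from the HBase row on incremental import) can be illustrated with a small model. This is a sketch under stated assumptions, not Sqoop's implementation: rows are plain dictionaries, `apply_incremental` and its `delete_nulls` flag are hypothetical names, and the values are taken from rowkey "1" in the example data above.

```python
def apply_incremental(hbase_row, source_row, delete_nulls):
    """Model writing one re-imported source row onto an existing HBase row.

    hbase_row:    dict of column -> value currently stored for the row key
    source_row:   dict of column -> value read from the source (None = SQL NULL)
    delete_nulls: False models the behaviour reported in the thread
                  (NULL columns are skipped); True models the requested
                  behaviour (NULL columns are removed from the row).
    """
    result = dict(hbase_row)
    for column, value in source_row.items():
        if value is not None:
            result[column] = value        # acts like a put of the new cell value
        elif delete_nulls:
            result.pop(column, None)      # acts like a delete of the stale cell
    return result


# Rowkey "1" after the initial import
existing = {
    "pay:amount": "4.99",
    "pay:customer_id": "1",
    "pay:last_update": "2017-01-23 05:29:09.0",
    "pay:payment_date": "2005-05-25 11:30:37.0",
    "pay:rental_id": "573",
    "pay:staff_id": "1",
}

# The same source row after rental_id was set to NULL
incoming = dict(existing)
incoming["pay:last_update"] = "2017-02-05 05:29:09.0"
incoming["pay:rental_id"] = None

current_behaviour = apply_incremental(existing, incoming, delete_nulls=False)
requested_behaviour = apply_incremental(existing, incoming, delete_nulls=True)
```

With `delete_nulls=False` the stale `pay:rental_id` value stays in the row, which matches the reported result; with `delete_nulls=True` the column is gone, which matches the expected data.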
