sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jilani Shaik <jilani2...@gmail.com>
Subject Re: sqoop hbase incremental import - Sqoop 1.4.6
Date Fri, 10 Mar 2017 13:57:10 GMT
Yes, you are correct: I am running from Eclipse. I will run from the command line.

Sent from my iPhone

> On Mar 10, 2017, at 6:48 AM, Boglarka Egyed <bogi@cloudera.com> wrote:
> 
> Hi Jilani,
> 
> Please try to execute "ant compile" and then "ant test" from the command line; this will run the
> unit tests. If I understood you correctly, you tried to run the tests from Eclipse, which won't work.
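
[Editor's note] For reference, the command-line sequence described above would look roughly like this. This is a hedged sketch: the targets come from Sqoop's Ant build, and the checkout path is an assumption.

```shell
# From the root of a Sqoop source checkout (path is an assumption):
cd ~/src/sqoop

ant compile   # compile the sources
ant test      # run the unit test suite
```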
> 
> Thanks,
> Bogi
> 
>> On Fri, Mar 10, 2017 at 6:10 AM, Jilani Shaik <jilani2423@gmail.com> wrote:
>> Hi Bogi,
>> 
>> Thanks for providing direction.
>> 
>> As you suggested, I explored further, resolved the issue, and was able to
>> test the fix (based on trunk code changes) in my Hadoop cluster.
>> 
>> Root cause of my issue:
>> The 1.4.6 code base uses the same Avro version that is present in my Hadoop
>> cluster, so there is no issue with that jar, whereas the trunk code
>> base uses the avro-1.8.1 jar files, which are not available in my Hadoop
>> cluster.
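
[Editor's note] The version mismatch described above can be spotted by comparing the Avro jar names shipped with each Sqoop build against the one on the cluster. A small hedged helper; the path in the example call follows the MapR layout mentioned later in this thread and is an assumption, not verified:

```shell
# Print the Avro version encoded in a jar file name, so the version bundled
# with a Sqoop build can be compared against the one on the Hadoop cluster.
avro_version() {
  basename "$1" | sed -n 's/^avro-\([0-9][0-9.]*[0-9]\)\.jar$/\1/p'
}

# Example (path is an assumption based on a MapR layout):
avro_version /opt/mapr/sqoop/sqoop-1.4.6/lib/avro-1.8.1.jar   # prints 1.8.1
```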
>> 
>> Can you suggest how to run the unit tests for this component?
>> 
>> I tried the "test" target, but I am getting failures as below.
>> 
>> Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.415 sec
>>     [junit] Running com.cloudera.sqoop.TestDirectImport
>>     [junit] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>> 13.705 sec
>>     [junit] Test com.cloudera.sqoop.TestDirectImport FAILED
>>     [junit] Running com.cloudera.sqoop.TestExport
>>     [junit] Tests run: 17, Failures: 0, Errors: 17, Skipped: 0, Time
>> elapsed: 22.564 sec
>>     [junit] Test com.cloudera.sqoop.TestExport FAILED
>>     [junit] Running com.cloudera.sqoop.TestExportUpdate
>> 
>> Do I need to make any changes? I am running the "test" target from Eclipse.
>> 
>> Thanks,
>> Jilani
>> 
>> 
>> On Thu, Mar 9, 2017 at 9:42 PM, Jilani Shaik <jilani2423@gmail.com> wrote:
>> 
>> > Hi Bogi,
>> >
>> > - Prepared jar using trunk with "jar-all" target
>> >
>> > - Copied the jar to /opt/mapr/sqoop/sqoop-1.4.6/
>> >
>> > - Moved out existing jar to some other location
>> >
>> > - Then executed the below command to do the import:
>> > sqoop import --connect jdbc:mysql://10.0.0.300/database123 --verbose
>> > --username test --password test123$ --table payment -m 2 --hbase-table
>> > /database/demoapp/hbase/payment --column-family pay --hbase-row-key
>> > payment_id --incremental lastmodified --merge-key payment_id --check-column
>> > last_update --last-value '2017-01-08 08:02:05.0'
>> >
>> >
>> > I followed the same steps for both jars: the one built from trunk and the
>> > one built from the 1.4.6 branch code.
>> >
>> > Where are you suggesting the multiple Avro jars come in: at the time of
>> > jar preparation, or when running the command using the jar?
>> >
>> >
>> > Thanks,
>> > Jilani
>> >
>> > On Thu, Mar 9, 2017 at 9:21 AM, Boglarka Egyed <bogi@cloudera.com> wrote:
>> >
>> >> Hi Jilani,
>> >>
>> >> I suspect that you have an old version of Avro or even multiple Avro
>> >> versions on your classpath and thus Sqoop uses an older one.
>> >>
>> >> Could you please provide a list of the exact commands you have performed
>> >> so that I can reproduce the issue?
>> >>
>> >> Thanks,
>> >> Bogi
>> >>
>> >> On Thu, Mar 9, 2017 at 2:51 AM, Jilani Shaik <jilani2423@gmail.com>
>> >> wrote:
>> >>
>> >>> Can someone give me pointers on what I am missing between the trunk and
>> >>> 1.4.6 builds? The trunk build gives the error mentioned in the mail chain
>> >>> below.
>> >>>
>> >>> I used the same ant target to prepare the jar for both branches, but the
>> >>> 1.4.6 jar still behaves differently from the 1.4.7 jar created from trunk.
>> >>>
>> >>> Thanks,
>> >>> Jilani
>> >>>
>> >>>
>> >>> On Wed, Mar 8, 2017 at 3:29 AM, Jilani Shaik <jilani2423@gmail.com>
>> >>> wrote:
>> >>>
>> >>> > Hi Bogi,
>> >>> >
>> >>> > I am getting the below error when I prepare the jar from trunk and try
>> >>> > to do a Sqoop import of a MySQL database table, whereas the same
>> >>> > changes work with branch 1.4.6.
>> >>> >
>> >>> >
>> >>> > 17/03/08 01:06:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7-SNAPSHOT
>> >>> > 17/03/08 01:06:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.
>> >>> > 17/03/08 01:06:25 WARN tool.BaseSqoopTool: Setting your password on the
>> >>> > command-line is insecure. Consider using -P instead.
>> >>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory:
>> >>> > org.apache.sqoop.manager.oracle.OraOopManagerFactory
>> >>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Loaded manager factory:
>> >>> > com.cloudera.sqoop.manager.DefaultManagerFactory
>> >>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
>> >>> > org.apache.sqoop.manager.oracle.OraOopManagerFactory
>> >>> > 17/03/08 01:06:25 DEBUG oracle.OraOopManagerFactory: Data Connector for
>> >>> > Oracle and Hadoop can be called by Sqoop!
>> >>> > 17/03/08 01:06:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
>> >>> > com.cloudera.sqoop.manager.DefaultManagerFactory
>> >>> > 17/03/08 01:06:25 DEBUG manager.DefaultManagerFactory: Trying with scheme:
>> >>> > jdbc:mysql:
>> >>> > Exception in thread "main" java.lang.NoClassDefFoundError:
>> >>> > org/apache/avro/LogicalType
>> >>> >         at org.apache.sqoop.manager.DefaultManagerFactory.accept(DefaultManagerFactory.java:67)
>> >>> >         at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
>> >>> >         at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:270)
>> >>> >         at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:97)
>> >>> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:617)
>> >>> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
>> >>> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>> >>> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
>> >>> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
>> >>> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
>> >>> >         at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
>> >>> > Caused by: java.lang.ClassNotFoundException: org.apache.avro.LogicalType
>> >>> >         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> >>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> >>> >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> >>> >         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> >>> >         ... 11 more
>> >>> >
>> >>> > Please let me know what is missing and how to resolve this exception.
>> >>> > Let me know if you need further details.
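
[Editor's note] One hedged workaround for this NoClassDefFoundError, assuming the missing class comes from the newer Avro that trunk bundles, is to put the avro-1.8.1 jar ahead of the cluster's older Avro on the classpath before invoking sqoop. The jar path below is an assumption based on the MapR layout mentioned earlier in this thread:

```shell
# Prepend the Avro 1.8.1 jar (bundled with the trunk build) to the classpath
# Hadoop/Sqoop will use, so org.apache.avro.LogicalType can be resolved.
# The jar location is an assumption; point it at your trunk build's lib dir.
AVRO_JAR=/opt/mapr/sqoop/sqoop-1.4.6/lib/avro-1.8.1.jar
export HADOOP_CLASSPATH="$AVRO_JAR${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
```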
>> >>> >
>> >>> > Thanks,
>> >>> > Jilani
>> >>> >
>> >>> > On Wed, Mar 1, 2017 at 4:38 AM, Boglarka Egyed <bogi@cloudera.com>
>> >>> wrote:
>> >>> >
>> >>> >> Hi Jilani,
>> >>> >>
>> >>> >> This is an example: SQOOP-3053
>> >>> >> <https://issues.apache.org/jira/browse/SQOOP-3053> with the review
>> >>> >> <https://reviews.apache.org/r/54206/> linked. Please make your changes
>> >>> >> on trunk, as it will be used to cut the future release, so your patch
>> >>> >> definitely needs to be able to apply on it.
>> >>> >>
>> >>> >> Thanks,
>> >>> >> Bogi
>> >>> >>
>> >>> >> On Wed, Mar 1, 2017 at 3:46 AM, Jilani Shaik <jilani2423@gmail.com>
>> >>> >> wrote:
>> >>> >>
>> >>> >> > Hi Bogi,
>> >>> >> >
>> >>> >> > Can you provide me with sample JIRA tickets and review requests
>> >>> >> > similar to this one, so I can proceed further?
>> >>> >> >
>> >>> >> > I applied the code changes from the Sqoop git repository on the
>> >>> >> > branch "sqoop-release-1.4.6-rc0". If you suggest the right branch,
>> >>> >> > I will take the code from there and apply the changes before
>> >>> >> > submitting the review request.
>> >>> >> >
>> >>> >> > Thanks,
>> >>> >> > Jilani
>> >>> >> >
>> >>> >> > On Mon, Feb 27, 2017 at 3:05 AM, Boglarka Egyed <bogi@cloudera.com>
>> >>> >> wrote:
>> >>> >> >
>> >>> >> >> Hi Jilani,
>> >>> >> >>
>> >>> >> >> To get your change committed please do the following:
>> >>> >> >> * Open a JIRA ticket for your change in Apache's JIRA system
>> >>> >> >> <https://issues.apache.org/jira/browse/SQOOP/> for project Sqoop
>> >>> >> >> * Create a review request at Apache's review board
>> >>> >> >> <https://reviews.apache.org/r/> for project Sqoop and link it to
>> >>> >> >> the JIRA ticket
>> >>> >> >>
>> >>> >> >> Please consider the guidelines below:
>> >>> >> >>
>> >>> >> >> Review board
>> >>> >> >> * Summary: generate your summary using the issue's JIRA key + JIRA title
>> >>> >> >> * Groups: add the relevant group so everyone on the project will
>> >>> >> >> know about your patch (Sqoop)
>> >>> >> >> * Bugs: add the issue's JIRA key so it's easy to navigate to the
>> >>> >> >> JIRA side
>> >>> >> >> * Repository: sqoop-trunk for Sqoop1 or sqoop-sqoop2 for Sqoop2
>> >>> >> >> * And as soon as the patch gets committed, it's very useful for the
>> >>> >> >> community if you close the review and mark it as "Submitted" at the
>> >>> >> >> Review board. The button to do this is at the top right of your own
>> >>> >> >> tickets, right next to the Download Diff button.
>> >>> >> >>
>> >>> >> >> Jira
>> >>> >> >> * Link: please add the link of the review as an external/web link
>> >>> >> >> so it's easy to navigate to the review side
>> >>> >> >> * Status: mark it as "patch available"
>> >>> >> >>
>> >>> >> >> The Sqoop community will receive emails about your new ticket and
>> >>> >> >> review request and will review your change.
>> >>> >> >>
>> >>> >> >> Thanks,
>> >>> >> >> Bogi
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Sat, Feb 25, 2017 at 2:14 AM, Jilani Shaik <
>> >>> jilani2423@gmail.com>
>> >>> >> >> wrote:
>> >>> >> >>
>> >>> >> >> > Do we have any update?
>> >>> >> >> >
>> >>> >> >> > I checked out the 1.4.6 code, made code changes to achieve this,
>> >>> >> >> > and tested them in the cluster; it is working as expected. Is
>> >>> >> >> > there a way I can contribute this as a patch so the committers
>> >>> >> >> > can validate it further and suggest any changes required to move
>> >>> >> >> > forward? Please suggest the approach.
>> >>> >> >> >
>> >>> >> >> > Thanks,
>> >>> >> >> > Jilani
>> >>> >> >> >
>> >>> >> >> > On Sun, Feb 5, 2017 at 10:41 PM, Jilani Shaik
<
>> >>> jilani2423@gmail.com>
>> >>> >> >> > wrote:
>> >>> >> >> >
>> >>> >> >> > > Hi Liz,
>> >>> >> >> > >
>> >>> >> >> > > Let's say we inserted data into a table with the initial
>> >>> >> >> > > import; it looks like this in the HBase shell:
>> >>> >> >> > >
>> >>> >> >> > >  1    column=pay:amount, timestamp=1485129654025, value=4.99
>> >>> >> >> > >  1    column=pay:customer_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  1    column=pay:last_update, timestamp=1485129654025, value=2017-01-23 05:29:09.0
>> >>> >> >> > >  1    column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >>> >> >> > >  1    column=pay:rental_id, timestamp=1485129654025, value=573
>> >>> >> >> > >  1    column=pay:staff_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  10   column=pay:amount, timestamp=1485129504390, value=5.99
>> >>> >> >> > >  10   column=pay:customer_id, timestamp=1485129504390, value=1
>> >>> >> >> > >  10   column=pay:last_update, timestamp=1485129504390, value=2006-02-15 22:12:30.0
>> >>> >> >> > >  10   column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >>> >> >> > >  10   column=pay:rental_id, timestamp=1485129504390, value=4526
>> >>> >> >> > >  10   column=pay:staff_id, timestamp=1485129504390, value=2
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > > Now assume that in the source, rental_id becomes NULL for
>> >>> >> >> > > rowkey "1", and then we do an incremental import into HBase.
>> >>> >> >> > > With the current import, the final HBase data after the
>> >>> >> >> > > incremental import will look like this:
>> >>> >> >> > >
>> >>> >> >> > >  1    column=pay:amount, timestamp=1485129654025, value=4.99
>> >>> >> >> > >  1    column=pay:customer_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  1    column=pay:last_update, timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> >>> >> >> > >  1    column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >>> >> >> > >  1    column=pay:rental_id, timestamp=1485129654025, value=573
>> >>> >> >> > >  1    column=pay:staff_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  10   column=pay:amount, timestamp=1485129504390, value=5.99
>> >>> >> >> > >  10   column=pay:customer_id, timestamp=1485129504390, value=1
>> >>> >> >> > >  10   column=pay:last_update, timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> >>> >> >> > >  10   column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >>> >> >> > >  10   column=pay:rental_id, timestamp=1485129504390, value=126
>> >>> >> >> > >  10   column=pay:staff_id, timestamp=1485129504390, value=2
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > > As the source column "rental_id" became NULL for rowkey "1",
>> >>> >> >> > > the final HBase table should not have "rental_id" for rowkey
>> >>> >> >> > > "1". I am expecting the below data for these rowkeys:
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > >  1    column=pay:amount, timestamp=1485129654025, value=4.99
>> >>> >> >> > >  1    column=pay:customer_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  1    column=pay:last_update, timestamp=1485129654025, value=2017-02-05 05:29:09.0
>> >>> >> >> > >  1    column=pay:payment_date, timestamp=1485129654025, value=2005-05-25 11:30:37.0
>> >>> >> >> > >  1    column=pay:staff_id, timestamp=1485129654025, value=1
>> >>> >> >> > >  10   column=pay:amount, timestamp=1485129504390, value=5.99
>> >>> >> >> > >  10   column=pay:customer_id, timestamp=1485129504390, value=1
>> >>> >> >> > >  10   column=pay:last_update, timestamp=1485129504390, value=2017-02-05 05:12:30.0
>> >>> >> >> > >  10   column=pay:payment_date, timestamp=1485129504390, value=2005-07-08 03:17:05.0
>> >>> >> >> > >  10   column=pay:rental_id, timestamp=1485129504390, value=126
>> >>> >> >> > >  10   column=pay:staff_id, timestamp=1485129504390, value=2
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > > Please let me know if anything further is required.
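
[Editor's note] Until the import itself issues deletes for columns that went NULL at the source, the stale cell in the first dump could be removed by hand. This sketch only builds the HBase shell statement as text (it does not touch a cluster); the table, rowkey, and column come from the example above:

```shell
# Construct the hbase-shell `delete` statement for one cell that went NULL
# at the source, e.g. pay:rental_id for rowkey "1".
hbase_delete_stmt() {
  printf "delete '%s', '%s', '%s'\n" "$1" "$2" "$3"
}

hbase_delete_stmt /database/demoapp/hbase/payment 1 pay:rental_id
# prints: delete '/database/demoapp/hbase/payment', '1', 'pay:rental_id'
```

The generated line could then be piped into `hbase shell` on a cluster where that table exists.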
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> > > Thanks,
>> >>> >> >> > > Jilani
>> >>> >> >> > >
>> >>> >> >> > > On Tue, Jan 31, 2017 at 3:38 AM, Erzsebet
Szilagyi <
>> >>> >> >> > > liz.szilagyi@cloudera.com> wrote:
>> >>> >> >> > >
>> >>> >> >> > >> Hi Jilani,
>> >>> >> >> > >> I'm not sure I completely understand what you are trying to
>> >>> >> >> > >> do. Could you give us some examples, with e.g. 4 columns and
>> >>> >> >> > >> 2 rows of example data, showing the changes that happen
>> >>> >> >> > >> compared to the changes you'd like to see?
>> >>> >> >> > >> Thanks,
>> >>> >> >> > >> Liz
>> >>> >> >> > >>
>> >>> >> >> > >> On Tue, Jan 31, 2017 at 5:18 AM, Jilani
Shaik <
>> >>> >> jilani2423@gmail.com>
>> >>> >> >> > >> wrote:
>> >>> >> >> > >>
>> >>> >> >> > >> >
>> >>> >> >> > >> > Please help in resolving this issue. Going through the
>> >>> >> >> > >> > source code, it seems the required behavior is missing,
>> >>> >> >> > >> > but I am not sure whether it was left out intentionally
>> >>> >> >> > >> > for some reason.
>> >>> >> >> > >> >
>> >>> >> >> > >> > Please give me some suggestions on how to handle this
>> >>> >> >> > >> > scenario.
>> >>> >> >> > >> >
>> >>> >> >> > >> > Thanks,
>> >>> >> >> > >> > Jilani
>> >>> >> >> > >> >
>> >>> >> >> > >> > On Sun, Jan 22, 2017 at 6:45 PM,
Jilani Shaik <
>> >>> >> >> jilani2423@gmail.com>
>> >>> >> >> > >> > wrote:
>> >>> >> >> > >> >
>> >>> >> >> > >> >> Hi,
>> >>> >> >> > >> >>
>> >>> >> >> > >> >> We have a scenario where we are importing data into HBase
>> >>> >> >> > >> >> with Sqoop incremental import.
>> >>> >> >> > >> >>
>> >>> >> >> > >> >> Let's say we imported a table, and later the source table
>> >>> >> >> > >> >> got updated with null values in some columns for some rows.
>> >>> >> >> > >> >> Then, after an incremental import, those columns should no
>> >>> >> >> > >> >> longer be present in the HBase table. But right now these
>> >>> >> >> > >> >> columns remain as they were, with their previous values.
>> >>> >> >> > >> >>
>> >>> >> >> > >> >> Is there any fix to overcome this issue?
>> >>> >> >> > >> >>
>> >>> >> >> > >> >>
>> >>> >> >> > >> >> Thanks,
>> >>> >> >> > >> >> Jilani
>> >>> >> >> > >> >>
>> >>> >> >> > >> >
>> >>> >> >> > >> >
>> >>> >> >> > >>
>> >>> >> >> > >
>> >>> >> >> > >
>> >>> >> >> >
>> >>> >> >>
>> >>> >> >
>> >>> >> >
>> >>> >>
>> >>> >
>> >>> >
>> >>>
>> >>
>> >>
>> >
> 
