sqoop-user mailing list archives

From Oliver Meyn <oli...@mineallmeyn.com>
Subject Re: hbase ignored when using --direct and mysql
Date Fri, 06 Jan 2012 08:34:02 GMT
Hi Jarcec,

doc jira: https://issues.apache.org/jira/browse/SQOOP-421

code jira: https://issues.apache.org/jira/browse/SQOOP-422

Please check that I got the components and versions right.

Thanks,
Oliver

On 2012-01-06, at 9:17 AM, Jarek Jarcec Cecho wrote:

> Hi Oliver,
> I definitely agree that this should be mentioned in the docs. Can you please create JIRA for that?
> 
> Could you also create a second JIRA covering the changes on the Sqoop source code side, so that we stop silently ignoring --hbase* parameters and raise an exception when the user supplies incompatible arguments?
> 
> Thanks,
> Jarcec
> 
> On Fri, Jan 06, 2012 at 08:57:17AM +0100, Oliver Meyn wrote:
>> Thanks Arvind - though I guess you mean "transferred data into HBase" below.  In any case this should be in the docs, since it cost me half a day and shouldn't cost anyone else that.  I'm not sure how that might get updated since the user guide looks to be part of the source, so should I make a Jira issue?
>> 
>> Thanks,
>> Oliver
>> 
>> On 2012-01-05, at 7:18 PM, Arvind Prabhakar wrote:
>> 
>>> Hi Oliver,
>>> 
>>> This is a limitation of the direct MySQL connector. This connector
>>> uses the MySQL batch utilities (mysqldump/mysqlimport) to do the data
>>> transfer directly over to HDFS. It does not support populating the
>>> transferred data into HDFS.
>>> 
>>> This is one of the areas of improvement in Sqoop 2. See [1] for more details.
>>> 
>>> [1] https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+2#Sqoop2-IntroducingaReducePhase
>>> 
>>> Thanks,
>>> Arvind
>>> 
>>> On Thu, Jan 5, 2012 at 7:22 AM, Oliver Meyn <oliver@mineallmeyn.com> wrote:
>>>> Hi all,
>>>> 
>>>> I'm trying to sqoop from mysql into HBase.  Everything works fine if I don't use --direct.  When I add --direct however the results get written into hdfs, and no error messages are generated.  It's as if the --hbase* params are all being ignored.  Is this by design?  Or a bug?
>>>> 
>>>> I'm using sqoop-1.3.0-cdh3u2, and all other hadoop-y pieces are cdh3u2.  Here's my options file (I've tried on the command line, and I've tried moving the --direct option to the start, end, and middle of the options list - all to no avail).
>>>> 
>>>> import
>>>> # use mysqldump
>>>> --direct
>>>> --mysql-delimiters
>>>> 
>>>> #connect to db
>>>> --connect
>>>> jdbc:mysql://xxx/portal
>>>> --username
>>>> xxx
>>>> --password
>>>> xxx
>>>> 
>>>> # from table raw_occurrence_record
>>>> --table
>>>> raw_occurrence_record
>>>> --split-by
>>>> id
>>>> --where
>>>> 'id < 100000'
>>>> 
>>>> # to hbase
>>>> --hbase-table
>>>> sqoop_ror_mini
>>>> --column-family
>>>> v
>>>> --hbase-create-table
>>>> 
>>>> # code generation reuse
>>>> --jar-file
>>>> raw_occurrence_record.jar
>>>> --class-name
>>>> raw_occurrence_record
>>>> 
>>>> Thanks,
>>>> Oliver
>>>> 
>>>> 
>>>> 
>> 
>> 
>> --
>> Oliver Meyn
>> Mine All Meyn, Inc.
>> Toronto: (416) 524-2240
>> Copenhagen: +45 51 50 38 90
>> "Things should be made as simple as possible, but no simpler." - Albert Einstein
>> 
>> 
>> 
>> 


--
Oliver Meyn
Mine All Meyn, Inc.
Toronto: (416) 524-2240
Copenhagen: +45 51 50 38 90
"Things should be made as simple as possible, but no simpler." - Albert Einstein
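
For illustration only: the check requested in the quoted thread (fail loudly instead of silently ignoring --hbase* parameters when --direct is given) might look roughly like the sketch below. It is written in Python, not against the Sqoop source, and the names read_options, find_ignored_hbase_flags and check_options are hypothetical, not part of Sqoop. It assumes the options-file layout quoted in the thread: one token per line, with blank lines and # comments skipped.

```python
def read_options(text):
    """Return the tokens of an options file, one per non-blank, non-comment line.

    Assumption: the layout matches the options file quoted in the thread.
    """
    tokens = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens.append(line)
    return tokens


def find_ignored_hbase_flags(tokens):
    """List the HBase-related flags that appear alongside --direct.

    Hypothetical rule: treat --hbase-* and --column-family as the flags
    that the thread reports as being ignored in direct mode.
    """
    if "--direct" not in tokens:
        return []
    return [
        t for t in tokens
        if t.startswith("--hbase-") or t == "--column-family"
    ]


def check_options(tokens):
    """Raise instead of silently continuing with an incompatible combination."""
    ignored = find_ignored_hbase_flags(tokens)
    if ignored:
        raise ValueError(
            "--direct cannot be combined with: " + ", ".join(ignored)
        )
```

Run against the options file quoted in the thread, find_ignored_hbase_flags would report --hbase-table, --column-family and --hbase-create-table, and check_options would raise rather than let the import write to HDFS unnoticed.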




