sqoop-user mailing list archives

From bejoy ks <bejo...@gmail.com>
Subject Re: [sqoop-user] Error while sqoop-import
Date Wed, 01 Feb 2012 06:34:23 GMT
This user group for Sqoop is deprecated; moving the conversation to the
Apache Sqoop user group.

Hi Bhavesh
            When you do import-all-tables, Sqoop does the default import
with each table's PK column in the RDBMS as the --split-by column. In many
cases the RDBMS has a non-numeric PK or a composite key as the PK; in those
circumstances you have to import the tables from the RDBMS individually,
with an appropriate --split-by column for *each table*.
In your case I believe AddressUid is a non-numeric field, and hence it is
causing issues with the boundary SQL fired by Sqoop (you need to specify
--split-by). So, as the bottom line, you need to issue individual import
statements for all the tables to be imported. Club those into a shell
script and fire the script if you want a single point of trigger.
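A minimal sketch of such a wrapper script (the table names and split-by columns below are hypothetical placeholders; substitute your own pairs — the connection string mirrors the one used later in this thread):

```shell
#!/bin/sh
# Single-point trigger for per-table imports.
# Each entry is <table>:<numeric split-by column>; both names here are
# placeholders, not values from an actual schema.
TABLES="Address:AddressId Customer:CustomerId"

for entry in $TABLES; do
  table=${entry%%:*}   # part before the colon
  col=${entry#*:}      # part after the colon
  # echo for a dry run; remove the echo to actually fire the imports
  echo ./sqoop-import \
    --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest' \
    --table "$table" --split-by "$col" --hive-import
done
```

Each table then gets its own numeric --split-by column, which sidesteps the uniqueidentifier problem below.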

Hope it helps!...

Regards
Bejoy.K.S


On Wed, Feb 1, 2012 at 11:44 AM, Bhavesh Shah <bhavesh25shah@gmail.com> wrote:

> Hello Bejoy K S,
> When I tried the import for a single table I executed it with the
> --split-by option, but when I tried sqoop-import-all-tables
> I got these errors:
>
>
> hadoop@ubuntu:~/sqoop-1.3.0-cdh3u1/bin$ ./sqoop-import-all-tables
> --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest'
> --hive-import
> 12/01/31 05:29:49 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> for output. You can override
> 12/01/31 05:29:49 INFO tool.BaseSqoopTool: delimiters with
> --fields-terminated-by, etc.
> 12/01/31 05:29:49 INFO SqlServer.MSSQLServerManagerFactory: Using
> Microsoft's SQL Server - Hadoop Connector
> 12/01/31 05:29:49 INFO manager.SqlManager: Using default fetchSize of 1000
> 12/01/31 05:29:49 INFO tool.CodeGenTool: Beginning code generation
> 12/01/31 05:29:50 INFO manager.SqlManager: Executing SQL statement: SELECT
> TOP 1 * FROM [Address]
> 12/01/31 05:29:50 INFO manager.SqlManager: Executing SQL statement: SELECT
> TOP 1 * FROM [Address]
> 12/01/31 05:29:50 INFO orm.CompilationManager: HADOOP_HOME is
> /home/hadoop/hadoop-0.20.2-cdh3u2
> 12/01/31 05:29:51 ERROR orm.CompilationManager: Could not rename
> /tmp/sqoop-hadoop/compile/6c3bc3e91230270163a8c4d1c5491071/Address.java to
> /home/hadoop/sqoop-1.3.0-cdh3u1/bin/./Address.java
>
> java.io.IOException: Destination
> '/home/hadoop/sqoop-1.3.0-cdh3u1/bin/./Address.java' already exists
>     at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
>     at
> com.cloudera.sqoop.orm.CompilationManager.compile(CompilationManager.java:227)
>     at com.cloudera.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
>     at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:337)
>     at
> com.cloudera.sqoop.tool.ImportAllTablesTool.run(ImportAllTablesTool.java:64)
>
>     at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:219)
>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:228)
>     at com.cloudera.sqoop.Sqoop.main(Sqoop.java:237)
> 12/01/31 05:29:51 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-hadoop/compile/6c3bc3e91230270163a8c4d1c5491071/Address.jar
> 12/01/31 05:29:51 INFO mapreduce.ImportJobBase: Beginning import of Address
> 12/01/31 05:29:51 INFO manager.SqlManager: Executing SQL statement: SELECT
> TOP 1 * FROM [Address]
> 12/01/31 05:29:52 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> SELECT MIN([AddressUid]), MAX([AddressUid]) FROM [Address]
> 12/01/31 05:29:52 INFO mapred.JobClient: Cleaning up the staging area
> hdfs://localhost:54310/home/hadoop/hadoopTemp/mapred/staging/hadoop/.staging/job_201201301559_0287
> 12/01/31 05:29:52 ERROR tool.ImportAllTablesTool: Encountered IOException
> running import job: java.io.IOException:
> com.microsoft.sqlserver.jdbc.SQLServerException: Operand data type
> uniqueidentifier is invalid for min operator.
>
> -Thanks
> Bhavesh Shah
>
>
>
> On Wed, Feb 1, 2012 at 10:51 AM, Bhavesh Shah <bhavesh25shah@gmail.com> wrote:
>
>> Hello bejoy ks,
>> After trying the import with --split-by on a NUMERIC column other than
>> AddressUid, it worked successfully. But I have to import all the tables
>> from the database, so in that case how do I manage this?
>>
>> -Thanks
>> Bhavesh Shah
>>
>>
>> On Tue, Jan 31, 2012 at 7:10 PM, bejoy ks <bejoyks@gmail.com> wrote:
>>
>>> Hi Bhavesh
>>>           Did you try the import with --split-by set to any NUMERIC
>>> column other than AddressUid?
>>> Getting the min and max values is necessary to divide the work between
>>> multiple mappers. Say your --split-by column is age, its min value is 18,
>>> its max value is 58, and you have 4 mappers doing the import in parallel.
>>> The individual queries executed by the mappers would be:
>>> SELECT * FROM table_name WHERE age BETWEEN 18 AND 27;
>>> SELECT * FROM table_name WHERE age BETWEEN 28 AND 37;
>>> SELECT * FROM table_name WHERE age BETWEEN 38 AND 47;
>>> SELECT * FROM table_name WHERE age BETWEEN 48 AND 58;
>>>
>>> Combined, the results are equivalent to SELECT * FROM table_name; which
>>> is the ultimate goal.
>>>
>>> The min and max values give the total boundary, and dividing that range
>>> by the number of mappers gives each mapper its scope.
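The arithmetic behind those four queries can be reproduced with a short shell sketch (an illustration only, using the numbers from the age example; the equal-step split is a simplification, and Sqoop's real split computation may differ in edge handling):

```shell
#!/bin/sh
# Reproduce the split ranges from the age example: min 18, max 58, 4 mappers.
MIN=18
MAX=58
MAPPERS=4
STEP=$(( (MAX - MIN + 1) / MAPPERS ))   # 41 / 4 = 10 per mapper

lo=$MIN
i=1
while [ "$i" -le "$MAPPERS" ]; do
  if [ "$i" -eq "$MAPPERS" ]; then
    hi=$MAX                    # last mapper absorbs any remainder
  else
    hi=$(( lo + STEP - 1 ))
  fi
  echo "mapper $i: SELECT * FROM table_name WHERE age BETWEEN $lo AND $hi"
  lo=$(( hi + 1 ))
  i=$(( i + 1 ))
done
```

Running it prints exactly the four BETWEEN ranges quoted above (18-27, 28-37, 38-47, 48-58), with the last mapper taking the leftover row at 58.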
>>>
>>>
>>>
>>> On Tue, Jan 31, 2012 at 6:52 PM, Bhavesh Shah <bhavesh25shah@gmail.com> wrote:
>>>
>>>> Hi Bejoy KS,
>>>> Thanks for your reply.
>>>> I have tried --split-by, but the error is the same as before. One thing
>>>> I don't understand:
>>>>
>>>> 12/01/31 01:22:04 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
>>>> SELECT MIN(AddressUid), MAX(AddressUid) FROM Address
>>>> Why is this query issued during the import?
>>>> Actually I want to use sqoop-import-all-tables, but I have a
>>>> uniqueidentifier column in all my SQL Server tables.
>>>> So what could be the solution for this?
>>>>
>>>> Thanks
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jan 31, 2012 at 6:44 PM, bejoy ks <bejoyks@gmail.com> wrote:
>>>>
>>>>> Hi Bhavesh
>>>>>        Can you try the import with the --split-by <column name>
>>>>> argument as well? Try a few columns that have numeric datatypes. If it
>>>>> succeeds, choose the column whose data distribution would be most
>>>>> uniform across mappers. The import can be something like this:
>>>>>
>>>>> ./sqoop-import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
>>>>> --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest'
>>>>> --table Address  --split-by someColumn  --hive-import --hive-table address
>>>>> --target-dir /userdata/somedir/dir/table
>>>>>
>>>>>
>>>>> On Tue, Jan 31, 2012 at 5:50 PM, Bhavesh Shah <bhavesh25shah@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>> I am trying a sqoop-import into Hive from SQL Server and I am getting
>>>>>> this error:
>>>>>>
>>>>>> ./sqoop-import --driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
>>>>>> --connect 'jdbc:sqlserver://192.168.1.1;username=abcd;password=12345;database=HadoopTest'
>>>>>> --table Address --hive-table address --hive-import
>>>>>> 12/01/31 01:22:01 INFO tool.BaseSqoopTool: Using Hive-specific
>>>>>> delimiters for output. You can override
>>>>>> 12/01/31 01:22:01 INFO tool.BaseSqoopTool: delimiters with
>>>>>> --fields-terminated-by, etc.
>>>>>> 12/01/31 01:22:01 INFO manager.SqlManager: Using default fetchSize of 1000
>>>>>> 12/01/31 01:22:01 INFO tool.CodeGenTool: Beginning code generation
>>>>>> 12/01/31 01:22:01 INFO manager.SqlManager: Executing SQL statement:
>>>>>> SELECT t.* FROM Address AS t WHERE 1=0
>>>>>> 12/01/31 01:22:02 INFO manager.SqlManager: Executing SQL statement:
>>>>>> SELECT t.* FROM Address AS t WHERE 1=0
>>>>>> 12/01/31 01:22:02 INFO orm.CompilationManager: HADOOP_HOME is
>>>>>> /home/hadoop/hadoop-0.20.2-cdh3u2
>>>>>> 12/01/31 01:22:03 ERROR orm.CompilationManager: Could not rename
>>>>>> /tmp/sqoop-hadoop/compile/0b20951314826ba92fce53b3fe95c464/Address.java to
>>>>>> /home/hadoop/sqoop-1.3.0-cdh3u1/bin/./Address.java
>>>>>> java.io.IOException: Destination
>>>>>> '/home/hadoop/sqoop-1.3.0-cdh3u1/bin/./Address.java' already exists
>>>>>>     at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:1811)
>>>>>>     at
>>>>>> com.cloudera.sqoop.orm.CompilationManager.compile(CompilationManager.java:227)
>>>>>>     at
>>>>>> com.cloudera.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
>>>>>>     at
>>>>>> com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:337)
>>>>>>     at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:423)
>>>>>>     at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>     at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
>>>>>>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:219)
>>>>>>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:228)
>>>>>>     at com.cloudera.sqoop.Sqoop.main(Sqoop.java:237)
>>>>>> 12/01/31 01:22:03 INFO orm.CompilationManager: Writing jar file:
>>>>>> /tmp/sqoop-hadoop/compile/0b20951314826ba92fce53b3fe95c464/Address.jar
>>>>>> 12/01/31 01:22:03 INFO mapreduce.ImportJobBase: Beginning import of Address
>>>>>> 12/01/31 01:22:03 INFO manager.SqlManager: Executing SQL statement:
>>>>>> SELECT t.* FROM Address AS t WHERE 1=0
>>>>>> 12/01/31 01:22:04 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
>>>>>> SELECT MIN(AddressUid), MAX(AddressUid) FROM Address
>>>>>> 12/01/31 01:22:04 INFO mapred.JobClient: Cleaning up the staging area
>>>>>> hdfs://localhost:54310/home/hadoop/hadoopTemp/mapred/staging/hadoop/.staging/job_201201301559_0274
>>>>>> *12/01/31 01:22:04 ERROR tool.ImportTool: Encountered IOException
>>>>>> running import job: java.io.IOException:
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerException: Operand data type
>>>>>> uniqueidentifier is invalid for min operator.*
>>>>>>     at
>>>>>> com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:204)
>>>>>>     at
>>>>>> org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
>>>>>>     at
>>>>>> org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
>>>>>>     at
>>>>>> org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>>>>>>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
>>>>>>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>>     at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>>>>>>     at
>>>>>> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>>>>>>     at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
>>>>>>     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
>>>>>>     at
>>>>>> com.cloudera.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:121)
>>>>>>     at
>>>>>> com.cloudera.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:181)
>>>>>>     at
>>>>>> com.cloudera.sqoop.manager.SqlManager.importTable(SqlManager.java:405)
>>>>>>     at
>>>>>> com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:350)
>>>>>>     at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:423)
>>>>>>     at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
>>>>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>>>>     at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
>>>>>>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:219)
>>>>>>     at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:228)
>>>>>>     at com.cloudera.sqoop.Sqoop.main(Sqoop.java:237)
>>>>>> *Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Operand
>>>>>> data type uniqueidentifier is invalid for min operator.*
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4575)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
>>>>>>     at
>>>>>> com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQuery(SQLServerStatement.java:611)
>>>>>>     at
>>>>>> com.cloudera.sqoop.mapreduce.db.DataDrivenDBInputFormat.getSplits(DataDrivenDBInputFormat.java:189)
>>>>>>     ... 22 more
>>>>>>
>>>>>> Please suggest a solution.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks and Regards,
>>>>>> Bhavesh Shah
>>>>>>
>>>>>>  --
>>>>>> NOTE: The mailing list sqoop-user@cloudera.org is deprecated in
>>>>>> favor of Apache Sqoop mailing list sqoop-user@incubator.apache.org.
>>>>>> Please subscribe to it by sending an email to
>>>>>> incubator-sqoop-user-subscribe@apache.org.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards
>>>>>        Bejoy
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Bhavesh Shah
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Regards
>>>        Bejoy
>>>
>>>
>>
>>
>
>
> --
> Regards,
> Bhavesh Shah
>
>



-- 
Regards
       Bejoy
