sqoop-user mailing list archives

From Abraham Elmahrek <...@cloudera.com>
Subject Re: Sqoop to HDFS error Cannot initialize cluster
Date Thu, 30 Jan 2014 20:01:09 GMT
It seems like mapreduce.framework.name is missing from this configuration.
You should be able to use a safety valve to manually add it in Cloudera
Manager. I believe the correct value here is "classic", since you don't have
YARN deployed.

To add a safety valve configuration for MapReduce, go to Services ->
MapReduce -> Configuration -> View and Edit -> Service Wide -> Advanced ->
Safety valve configuration for mapred-site.xml. You should be able to add the
entry:

<property>
  <name>mapreduce.framework.name</name>
  <value>classic</value>
</property>

Then save and restart MR. Let us know how it goes.
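
If you want to sanity-check the value before restarting the whole service,
you should also be able to pass it to a single job via Hadoop's generic -D
option (the connect string below is just carried over from your earlier
mail):

$ sqoop import -D mapreduce.framework.name=classic \
    --connect jdbc:oracle:thin:@localhost:1521/DB11G \
    --username sqoop --password xx --table sqoop.test

Note that the -D arguments have to come before the Sqoop-specific options.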

-Abe


On Thu, Jan 30, 2014 at 11:18 AM, Brenden Cobb <Brenden.Cobb@humedica.com> wrote:

>  mapred-site.xml:
>
>  <!--Autogenerated by Cloudera CM on 2013-12-04T22:38:07.943Z-->
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>som-dmsandbox01.humedica.net:8021</value>
>   </property>
>   <property>
>     <name>mapred.job.tracker.http.address</name>
>     <value>0.0.0.0:50030</value>
>   </property>
>   <property>
>     <name>mapreduce.job.counters.max</name>
>     <value>120</value>
>   </property>
>   <property>
>     <name>mapred.output.compress</name>
>     <value>false</value>
>   </property>
>   <property>
>     <name>mapred.output.compression.type</name>
>     <value>BLOCK</value>
>   </property>
>   <property>
>     <name>mapred.output.compression.codec</name>
>     <value>org.apache.hadoop.io.compress.DefaultCodec</value>
>   </property>
>   <property>
>     <name>mapred.map.output.compression.codec</name>
>     <value>org.apache.hadoop.io.compress.SnappyCodec</value>
>   </property>
>   <property>
>     <name>mapred.compress.map.output</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>zlib.compress.level</name>
>     <value>DEFAULT_COMPRESSION</value>
>   </property>
>   <property>
>     <name>io.sort.factor</name>
>     <value>64</value>
>   </property>
>   <property>
>     <name>io.sort.record.percent</name>
>     <value>0.05</value>
>   </property>
>   <property>
>     <name>io.sort.spill.percent</name>
>     <value>0.8</value>
>   </property>
>   <property>
>     <name>mapred.reduce.parallel.copies</name>
>     <value>10</value>
>   </property>
>   <property>
>     <name>mapred.submit.replication</name>
>     <value>2</value>
>   </property>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>2</value>
>   </property>
>   <property>
>     <name>mapred.userlog.retain.hours</name>
>     <value>24</value>
>   </property>
>   <property>
>     <name>io.sort.mb</name>
>     <value>71</value>
>   </property>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value> -Xmx298061516</value>
>   </property>
>   <property>
>     <name>mapred.job.reuse.jvm.num.tasks</name>
>     <value>1</value>
>   </property>
>   <property>
>     <name>mapred.map.tasks.speculative.execution</name>
>     <value>false</value>
>   </property>
>   <property>
>     <name>mapred.reduce.tasks.speculative.execution</name>
>     <value>false</value>
>   </property>
>   <property>
>     <name>mapred.reduce.slowstart.completed.maps</name>
>     <value>0.8</value>
>   </property>
> </configuration>
>
>   From: Abraham Elmahrek <abe@cloudera.com>
> Reply-To: "user@sqoop.apache.org" <user@sqoop.apache.org>
> Date: Thursday, January 30, 2014 2:13 PM
>
> To: "user@sqoop.apache.org" <user@sqoop.apache.org>
> Subject: Re: Sqoop to HDFS error Cannot initialize cluster
>
>   Hmmm could you provide your mapred-site.xml? It seems like you need to
> update the mapreduce.framework.name to "classic" if you're using MR1.
>
>  -Abe
>
>
> On Thu, Jan 30, 2014 at 11:02 AM, Brenden Cobb <Brenden.Cobb@humedica.com> wrote:
>
>>  Hi Abe-
>>
>>  Sqoop 1.4.3 was installed as part of CDH 4.5
>>
>>  Using the server domain instead of localhost did push things along a
>> bit, but the job is complaining that the LocalJobRunner is on the "master"
>> node in the cluster:
>>
>>  14/01/30 13:49:18 INFO mapreduce.Cluster: Failed to use
>> org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid
>> "mapreduce.jobtracker.address" configuration value for LocalJobRunner : "
>> som-dmsandbox01.humedica.net:8021"
>> 14/01/30 13:49:18 ERROR security.UserGroupInformation:
>> PriviledgedActionException as:oracle (auth:SIMPLE)
>> cause:java.io.IOException: Cannot initialize Cluster. Please check your
>> configuration for mapreduce.framework.name and the correspond server
>> addresses.
>> 14/01/30 13:49:18 ERROR tool.ImportTool: Encountered IOException running
>> import job: java.io.IOException: Cannot initialize Cluster. Please check
>> your configuration for mapreduce.framework.name and the correspond
>> server addresses.
>>  at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
>> at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
>> at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
>>
>>
>>  The instance above mentions som-dmsandbox01.humedica.net (the master),
>> while the machine I'm executing on is som-dmsandbox03.humedica.net.
>>
>>  -BC
>>
>>   From: Abraham Elmahrek <abe@cloudera.com>
>> Reply-To: "user@sqoop.apache.org" <user@sqoop.apache.org>
>> Date: Thursday, January 30, 2014 1:49 PM
>> To: "user@sqoop.apache.org" <user@sqoop.apache.org>
>> Subject: Re: Sqoop to HDFS error Cannot initialize cluster
>>
>>   Hey there,
>>
>>  Sqoop1 is actually just a really heavy client. It will create jobs in
>> MapReduce for data transferring.
>>
>>  With that being said, I'm curious how Sqoop was installed. What
>> version of Sqoop1 are you running? It might be as simple as setting the
>> HADOOP_HOME environment variable or updating one of the configs.
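>>
>>  For example (the paths below are the usual CDH4 package locations; just
>> an assumption on my part, adjust if your install differs):
>>
>>  $ export HADOOP_HOME=/usr/lib/hadoop
>>  $ export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce   # MR1 client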
>>
>>  -Abe
>>
>>
>> On Thu, Jan 30, 2014 at 10:36 AM, Brenden Cobb <Brenden.Cobb@humedica.com> wrote:
>>
>>>  I think I have part of the answer... I'm specifying localhost when I
>>> think I should be using the actual domain, otherwise Sqoop thinks it's not
>>> in distributed mode?
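>>>
>>>  i.e. something like this instead (with <db-host> standing in for
>>> wherever Oracle actually runs; just illustrating the shape of it):
>>>
>>>  $ sqoop import --connect jdbc:oracle:thin:@<db-host>:1521/DB11G \
>>>      --username sqoop --password xx --table sqoop.test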
>>>
>>>  -BC
>>>
>>>   From: Brenden Cobb <brenden.cobb@humedica.com>
>>> Reply-To: "user@sqoop.apache.org" <user@sqoop.apache.org>
>>> Date: Thursday, January 30, 2014 12:34 PM
>>> To: "user@sqoop.apache.org" <user@sqoop.apache.org>
>>> Subject: Sqoop to HDFS error Cannot initialize cluster
>>>
>>>   Hello-
>>>
>>>  I'm trying to sqoop data from Oracle to HDFS but getting the following
>>> error:
>>>
>>>  $ sqoop import --connect jdbc:oracle:thin:@localhost:1521/DB11G
>>> --username sqoop --password xx --table sqoop.test
>>>
>>>  ...
>>>  14/01/30 10:58:10 INFO orm.CompilationManager: Writing jar file:
>>> /tmp/sqoop-oracle/compile/fa0ce9acd6ac6d0c349389a6dbfee62b/sqoop.test.jar
>>> 14/01/30 10:58:10 INFO mapreduce.ImportJobBase: Beginning import of
>>> sqoop.test
>>> 14/01/30 10:58:10 WARN conf.Configuration: mapred.job.tracker is
>>> deprecated. Instead, use mapreduce.jobtracker.address
>>> 14/01/30 10:58:10 WARN conf.Configuration: mapred.jar is deprecated.
>>> Instead, use mapreduce.job.jar
>>> 14/01/30 10:58:10 INFO manager.SqlManager: Executing SQL statement:
>>> SELECT FIRST,LAST,EMAIL FROM sqoop.test WHERE 1=0
>>> 14/01/30 10:58:11 WARN conf.Configuration: mapred.map.tasks is
>>> deprecated. Instead, use mapreduce.job.maps
>>> 14/01/30 10:58:11 ERROR security.UserGroupInformation:
>>> PriviledgedActionException as:oracle (auth:SIMPLE)
>>> cause:java.io.IOException: Cannot initialize Cluster. Please check
>>> your configuration for mapreduce.framework.name and the correspond
>>> server addresses.
>>> 14/01/30 10:58:11 ERROR tool.ImportTool: Encountered IOException running
>>> import job: java.io.IOException: Cannot initialize Cluster. Please check
>>> your configuration for mapreduce.framework.name and the correspond
>>> server addresses.
>>>
>>>  at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:122)
>>> at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:84)
>>> at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:77)
>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1239)
>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1235)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>> at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>> at org.apache.hadoop.mapreduce.Job.connect(Job.java:1234)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1263)
>>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1287)
>>> at
>>> org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
>>> at
>>> org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
>>> at
>>> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:247)
>>> at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:606)
>>> at
>>> com.quest.oraoop.OraOopConnManager.importTable(OraOopConnManager.java:260)
>>> at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:413)
>>> at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
>>> at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>> at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
>>> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:222)
>>> at org.apache.sqoop.Sqoop.runTool(Sqoop.java:231)
>>> at org.apache.sqoop.Sqoop.main(Sqoop.java:240)
>>>
>>>
>>>  Checking just the Database side works ok:
>>>  $ sqoop list-tables --connect jdbc:oracle:thin:@localhost:1521:DB11G
>>> --username sqoop --password xx
>>> Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
>>> Please set $HCAT_HOME to the root of your HCatalog installation.
>>> 14/01/30 12:12:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.3-cdh4.5.0
>>> 14/01/30 12:12:20 WARN tool.BaseSqoopTool: Setting your password on the
>>> command-line is insecure. Consider using -P instead.
>>> 14/01/30 12:12:20 INFO manager.SqlManager: Using default fetchSize of
>>> 1000
>>> 14/01/30 12:12:21 INFO manager.OracleManager: Time zone has been set to
>>> GMT
>>> TEST
>>>
>>>
>>>  Any thoughts?
>>>
>>>  Thanks,
>>> BC
>>>
>>
>>
>
