hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10924) LocalDistributedCacheManager for concurrent sqoop processes fails to create unique directories
Date Mon, 16 Feb 2015 18:05:11 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323087#comment-14323087
] 

zhihai xu commented on HADOOP-10924:
------------------------------------

[~ozawa], thanks for the comment.
bq.  I think JobID + timestamp can conflict
Based on the following code in [LocalJobRunner.java|https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapred/LocalJobRunner.java#L753]
which generate the JobID:
{code}
  private static int jobid = 0;
  // used for making sure that local jobs run in different jvms don't
  // collide on staging or job directories
  private int randid;
  public synchronized org.apache.hadoop.mapreduce.JobID getNewJobID() {
    return new org.apache.hadoop.mapreduce.JobID("local" + randid, ++jobid);
  }
{code}
the JobID consists of a random number "randid" and a static variable "jobid".
It looks like the random number "randid" will make sure that local jobs run in different jvms
don't have same JobID and
the static variable "jobid" will further make sure local jobs run in the same jvm don't have
same JobID.
So it looks like "randid" will do the same thing as randomUUID.
Correct me if there are some other corner cases I didn't find.

> LocalDistributedCacheManager for concurrent sqoop processes fails to create unique directories
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-10924
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10924
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: William Watson
>         Attachments: HADOOP-10924.02.patch
>
>
> Kicking off many sqoop processes in different threads results in:
> {code}
> 2014-08-01 13:47:24 -0400:  INFO - 14/08/01 13:47:22 ERROR tool.ImportTool: Encountered
IOException running import job: java.io.IOException: java.util.concurrent.ExecutionException:
java.io.IOException: Rename cannot overwrite non empty destination directory /tmp/hadoop-hadoop/mapred/local/1406915233073
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapred.LocalDistributedCacheManager.setup(LocalDistributedCacheManager.java:149)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:163)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:731)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:432)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO - 	at java.security.AccessController.doPrivileged(Native
Method)
> 2014-08-01 13:47:24 -0400:  INFO - 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:186)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:159)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:239)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:645)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:502)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> 2014-08-01 13:47:24 -0400:  INFO - 	at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> {code}
> If two are kicked off in the same second. The issue is the following lines of code in
the org.apache.hadoop.mapred.LocalDistributedCacheManager class: 
> {code}
>     // Generating unique numbers for FSDownload.
>     AtomicLong uniqueNumberGenerator =
>        new AtomicLong(System.currentTimeMillis());
> {code}
> and 
> {code}
> Long.toString(uniqueNumberGenerator.incrementAndGet())),
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message