falcon-dev mailing list archives

From "Arpit Gupta (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-389) Submit Hcat export workflow to oozie on source cluster rather than to oozie on destination cluster
Date Mon, 07 Apr 2014 18:15:16 GMT

    [ https://issues.apache.org/jira/browse/FALCON-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962084#comment-13962084 ]

Arpit Gupta commented on FALCON-389:

The way we fixed this was to make sure that every oozie server falcon talks to has the
hadoop configs for all the hadoop servers falcon talks to.

For example, we had to change this property in oozie-site.xml:

          Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
          the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
          used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
          the relevant Hadoop *-site.xml files. If the path is relative is looked within
          the Oozie configuration directory; though the path can be absolute (i.e. to point
          to Hadoop client conf/ directories in the local filesystem.

Essentially, for every namenode and RM we had to provide the config dir that holds the
appropriate hdfs and yarn configs.
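
As a rough sketch of what such an entry could look like: the property name is not given in the message above; oozie.service.HadoopAccessorService.hadoop.configurations is the Oozie property whose description matches the quoted text, so it is assumed here. The host names, ports and conf directories are hypothetical placeholders for a two-cluster setup, not values from our environment.

```xml
<!-- oozie-site.xml on each oozie server that falcon talks to (illustrative only) -->
<property>
  <!-- Assumed property name; it is not stated in the original message -->
  <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
  <!-- One AUTHORITY=HADOOP_CONF_DIR entry per namenode and RM, plus the '*' fallback.
       All hosts, ports and paths below are hypothetical. -->
  <value>*=hadoop-conf,nn1.example.com:8020=/etc/hadoop/conf-cluster1,rm1.example.com:8032=/etc/hadoop/conf-cluster1,nn2.example.com:8020=/etc/hadoop/conf-cluster2,rm2.example.com:8032=/etc/hadoop/conf-cluster2</value>
</property>
```

Each conf dir would need to contain the matching cluster's *-site.xml files, and every oozie server involved needs the same set of entries.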

One way to avoid this would be to submit the table export job to oozie on the source
cluster and the rest of the jobs to oozie on the destination cluster.

> Submit Hcat export workflow to oozie on source cluster rather than to oozie on destination cluster
> --------------------------------------------------------------------------------------------------
>                 Key: FALCON-389
>                 URL: https://issues.apache.org/jira/browse/FALCON-389
>             Project: Falcon
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Arpit Gupta
> Noticed this on hadoop-2 with oozie 4.x: when you run an hcat replication job where the
> source and destination clusters are different, all jobs are submitted to oozie on the
> destination cluster. Oozie then runs a table export job that it submits to the RM on cluster 1.
> Now if the oozie server on the target cluster is not running with the hadoop configs for
> every cluster, it will not know the appropriate settings and the yarn job will fail. We saw
> jobs fail with errors like
> org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not found for ApplicationAttempt
> on an unsecured cluster as well.

This message was sent by Atlassian JIRA