hbase-user mailing list archives

From Adrian Sandulescu <sandulescu.adr...@gmail.com>
Subject exportSnapshot MISSING_FILES
Date Tue, 08 Oct 2013 13:35:03 GMT
Hello everyone,

I'm using this tool to export and "import" snapshots from S3:

It seems like a better option than ExportTable, since there isn't another HDFS cluster on hand.

It uses the following trick to make exportSnapshot "import" from S3 to the
local HDFS.

            // Override the default filesystem so the job reads from S3
            config.set("fs.default.name", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
            config.set("fs.defaultFS", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
            // S3 credentials for the (old-style) s3:// filesystem
            config.set("fs.s3.awsAccessKeyId", accessKey);
            config.set("fs.s3.awsSecretAccessKey", accessSecret);
            // Local scratch space and the S3 location of the exported snapshot
            config.set("hbase.tmp.dir", "/tmp/hbase-${user.name}");
            config.set("hbase.rootdir", s3Url);

Imports work great, but only when using the s3n:// protocol (which means an
HFile size limit of 5 GB).
When using the s3:// protocol, I get the following:
13/10/08 13:32:01 INFO mapred.JobClient:     MISSING_FILES=1
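
For comparison, I assume the working s3n:// case differs only in the scheme
and credential key names; that's my guess at the tool's internals rather than
something I've confirmed in its source:

            // Assumed s3n:// equivalents of the overrides above. In Hadoop of
            // this era, s3n:// (NativeS3FileSystem) reads and writes plain S3
            // objects (hence the 5 GB per-object limit), while s3:// (S3FileSystem)
            // stores data as opaque blocks, so files written with one scheme
            // are not visible through the other.
            config.set("fs.defaultFS", "s3n://" + accessKey + ":" + accessSecret + "@" + bucketName);
            config.set("fs.s3n.awsAccessKeyId", accessKey);
            config.set("fs.s3n.awsSecretAccessKey", accessSecret);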

The author said he wasn't able to debug it and just uses s3n:// until it
becomes a problem.

Has anyone encountered this when using exportSnapshot?
Can you please point me in the right direction?

