hbase-user mailing list archives

From "Shi, Shaofeng" <shao...@ebay.com>
Subject Re: Can TableSnapshotInputFormat support multiple snapshots as the MR input?
Date Sat, 23 May 2015 03:11:32 GMT
Hi Andrew, this is what we need, thank you! In which version will this
feature be released? Our HBase is v0.98; is it possible to just apply this
patch to get the feature?

On 5/22/15, 6:06 PM, "Andrew Mains" <andrew.mains@kontagent.com> wrote:

>In the latest release, no; however I've filed a ticket here
>https://issues.apache.org/jira/browse/HBASE-13356 for this feature, and
>uploaded a patch for review.
>The patch provides a MultiTableSnapshotInputFormat which can run a list
>of scans over multiple snapshots. Jobs can be initialized using:
>  public static void initMultiTableSnapshotMapperJob(
>      Map<String, Collection<Scan>> snapshotScans,
>      Class<? extends TableMapper> mapper,
>      Class<?> outputKeyClass,
>      Class<?> outputValueClass,
>      Job job,
>      boolean addDependencyJars,
>      Path tmpRestoreDir) throws IOException {
>Hope this helps!
>On 5/22/15 2:35 AM, Shi, Shaofeng wrote:
>> Hello,
>> We have a scenario that needs to merge multiple HBase tables into one
>>table periodically. To get better performance and minimize the impact on
>>the HBase server, we are evaluating the approach described at
>>http://www.slideshare.net/enissoz/mapreduce-over-snapshots, but from
>>the API we see that it only allows one snapshot as input. Is it possible to
>>change it to allow multiple snapshots?
>> Thanks in advance for any advice.
>> Shaofeng Shi
>> Apache Kylin
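
For readers following the thread, here is a rough sketch of the shape of the snapshotScans argument in the signature quoted above: one map entry per snapshot name, each holding the collection of scans to run over that snapshot. To keep it runnable without HBase on the classpath, RowRange is a hypothetical stand-in for HBase's Scan, and the class name, method names, snapshot names, and row keys are all made up for illustration. None of it comes from the patch itself.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

public class SnapshotScanPlan {

    // Hypothetical stand-in for HBase's Scan: only a start/stop row pair.
    public static final class RowRange {
        public final String startRow;
        public final String stopRow;

        public RowRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + stopRow + ")";
        }
    }

    // Adds one range under a snapshot name, creating the collection on first use.
    public static void addRange(Map<String, Collection<RowRange>> plan,
                                String snapshotName, String startRow, String stopRow) {
        plan.computeIfAbsent(snapshotName, k -> new ArrayList<>())
            .add(new RowRange(startRow, stopRow));
    }

    // Builds a plan with two ranges over one snapshot and one range over another.
    // Snapshot names and row keys are illustrative only.
    public static Map<String, Collection<RowRange>> examplePlan() {
        Map<String, Collection<RowRange>> plan = new LinkedHashMap<>();
        addRange(plan, "snapshot_a", "row000", "row500");
        addRange(plan, "snapshot_a", "row500", "row999");
        addRange(plan, "snapshot_b", "", "");
        return plan;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Collection<RowRange>> e : examplePlan().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

In a real job the map values would be HBase Scan objects, and the map would be passed as the first argument of initMultiTableSnapshotMapperJob together with the mapper class, output key and value classes, the Job, and a temporary restore directory, as in the signature Andrew posted.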
