sqoop-dev mailing list archives

From "Ryan Blue (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1744) TO-side: Write data to HBase
Date Tue, 02 Dec 2014 23:27:13 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232297#comment-14232297 ]

Ryan Blue commented on SQOOP-1744:

You're right that HBase doesn't buy us much. In that situation, where we can't isolate a subset
of the data that might change, I think we have two options: either rewrite the entire dataset
each time or maintain the dataset in HBase. We shouldn't overlook the second option, which
would facilitate the fetch frequency that you want. Parquet is a great format to use, but
if we have to constantly rewrite the entire dataset or very substantial portions of it, then
it might not be worth the storage savings.

> TO-side: Write data to HBase
> ----------------------------
>                 Key: SQOOP-1744
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1744
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: connectors
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>             Fix For: 1.99.5
> Propose to write data into HBase. Note that, unlike HDFS, HBase is append-only.
> Merge does not work for HBase.

This message was sent by Atlassian JIRA
