sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anna Szonyi <szo...@cloudera.com>
Subject Re: Review Request 57499: Sqoop incremental import - NULL column updates are not pulled into HBase table
Date Tue, 27 Jun 2017 12:29:06 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57499/#review178977
-----------------------------------------------------------



Hi Jilani,

Thank you for your changes and apologies for the latency! I've reviewed the changes and they
look really good! There is one last comment I am making that I want to ask your thoughts on
and if we resolve that one I will commit your changes as soon as possible.

Thanks,
Anna


src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java
Lines 172-200 (original), 174-217 (patched)
<https://reviews.apache.org/r/57499/#comment253359>

    Hey Jilani,
    
    Instead of this, maybe we could initialize the List at the top of the method and instead
of using the booleans, like:
    
     List<Mutation> mutationList = new ArrayList<>();
      ...
        for (...) {
             ...
            Put put = new Put(Bytes.toBytes(rowKey));
            if (null != val) {
              if ( val instanceof byte[]) {
                put.addColumn(...);
              } else {
                put.addColumn(...);
              }
              mutationList.add(put);
            } else {
              Delete delete = new Delete(Bytes.toBytes(rowKey));
              delete.addColumn(...);
              mutationList.add(delete);
            }
          }
        }
    
        return Collections.unmodifiableList(mutationList);
      }
      
    or even use a Stack/Deque if you want to ensure that there is only a single (last) element
that we return in either case:
    
        Deque<Mutation> mutations = new ArrayDeque<>();
        ...
        for (...) {
          ...
            Put put = new Put(Bytes.toBytes(rowKey));
            if (null != val) {
              if ( val instanceof byte[]) {
                put.addColumn(...);
              } else {
                put.addColumn(...);
              }
              mutations.addFirst(put);
            } else {
              Delete delete = new Delete(Bytes.toBytes(rowKey));
              delete.addColumn(...)
              mutations.addFirst(delete);
            }
          }
        }
    
        return Collections.singletonList(mutations.peekFirst());
        
    Or something similar.
    This way we don't have to track and check the isDelete and isPop, which makes the code
a bit more clean and extensible.
    Please let me know your thoughts on this!


- Anna Szonyi


On June 11, 2017, 8:21 p.m., Jilani Shaik wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57499/
> -----------------------------------------------------------
> 
> (Updated June 11, 2017, 8:21 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, Attila Szabo, and Anna Szonyi.
> 
> 
> Bugs: SQOOP-3149
>     https://issues.apache.org/jira/browse/SQOOP-3149
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> HBase delete is added as part of incremental data import, Modified the return type of
method which is responsible for holding list of Put objects to List of Mutation objects and
at the time of execution of insert or delete using the type of object in Mutation, executed
the actaul operation either insert or delete.
> 
> Jira ticket for the same: https://issues.apache.org/jira/browse/SQOOP-3149
> 
> Similar ticket to above:  SQOOP-3125
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/hbase/HBasePutProcessor.java fdbe1276 
>   src/java/org/apache/sqoop/hbase/PutTransformer.java 533467e5 
>   src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java 363e1456 
>   src/java/org/apache/sqoop/mapreduce/HBaseBulkImportMapper.java 58ccee7b 
>   src/test/com/cloudera/sqoop/hbase/HBaseImportTest.java fa14a013 
>   src/test/com/cloudera/sqoop/hbase/HBaseTestCase.java a054eb66 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java 6310a398 
> 
> 
> Diff: https://reviews.apache.org/r/57499/diff/6/
> 
> 
> Testing
> -------
> 
> Executed with jar prepared with these changes into hadoop cluster and tested bot import
and then incremental import.
> 
> 
> File Attachments
> ----------------
> 
> HBase delete support for incremental import
>   https://reviews.apache.org/media/uploaded/files/2017/03/11/5b1895fd-4c6b-42fa-8a92-4e093153e370__hbase_delete_support_in_incremental_import
> updated with unit test and based on below suggestions
>   https://reviews.apache.org/media/uploaded/files/2017/03/22/708f63ba-2d8a-4a47-ab67-c1d2776354fd__hbase_delete_support_in_incremental_unit_test_also
> 
> 
> Thanks,
> 
> Jilani Shaik
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message