sqoop-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-319) The --hive-drop-import-delims option should accept a replacement string
Date Sat, 20 Aug 2011 00:07:29 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088083#comment-13088083 ]

jiraposter@reviews.apache.org commented on SQOOP-319:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1598/#review1579
-----------------------------------------------------------


Thanks for the patch, Joey. A high-level suggestion: please add validation that stops users
from specifying both --hive-drop-import-delims and the new --hive-delims-replacement option,
as the two are logically incompatible.
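
A minimal sketch of the kind of check meant here. The class and field names below are
hypothetical stand-ins, not the actual SqoopOptions or BaseSqoopTool code, and the exception
type is only an assumption for illustration:

```java
/** Hypothetical holder for the two parsed Hive delimiter options. */
final class HiveDelimOptions {
  /** True if --hive-drop-import-delims was given. */
  final boolean dropDelims;
  /** Value of --hive-delims-replacement, or null if it was not given. */
  final String delimsReplacement;

  HiveDelimOptions(boolean dropDelims, String delimsReplacement) {
    this.dropDelims = dropDelims;
    this.delimsReplacement = delimsReplacement;
  }
}

/** Hypothetical validator; the real tool may report errors differently. */
final class HiveDelimValidator {
  private HiveDelimValidator() { }

  /** Rejects option sets that ask for both dropping and replacing. */
  static void validate(HiveDelimOptions opts) {
    if (opts.dropDelims && opts.delimsReplacement != null) {
      throw new IllegalArgumentException(
          "--hive-drop-import-delims conflicts with "
          + "--hive-delims-replacement; specify only one of them.");
    }
  }
}
```

Either option on its own passes the check; only the combination is rejected.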

A refactoring suggestion and minor checkstyle comments below.


src/java/com/cloudera/sqoop/lib/FieldFormatter.java
<https://reviews.apache.org/r/1598/#comment3565>

    It would be better to create another method, hiveStringReplaceDelims(String,String),
and have the original method call it with the replacement string set to the empty string.
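
    A rough sketch of that shape, under some assumptions: the class name is hypothetical,
the signatures are simplified, and the delimiter set (\n, \r and \01) is taken to be
Hive's defaults rather than copied from the patch:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the relevant part of FieldFormatter. */
final class FieldFormatterSketch {
  /** Assumed Hive default delimiters: \n, \r and \01 (Ctrl-A). */
  private static final Pattern HIVE_DELIMS =
      Pattern.compile("[\\n\\r\\x01]");

  private FieldFormatterSketch() { }

  /** Original behavior: dropping is replacing with the empty string. */
  static String hiveStringDropDelims(String str) {
    return hiveStringReplaceDelims(str, "");
  }

  /** Replaces every Hive delimiter in str with the given replacement. */
  static String hiveStringReplaceDelims(String str, String replacement) {
    if (str == null) {
      return null;
    }
    // quoteReplacement keeps '$' and '\' in the replacement literal.
    return HIVE_DELIMS.matcher(str)
        .replaceAll(Matcher.quoteReplacement(replacement));
  }
}
```

    With this split, the drop path and the replace path share one code path, so a fix to
the delimiter handling only has to be made once.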



src/java/com/cloudera/sqoop/orm/ClassWriter.java
<https://reviews.apache.org/r/1598/#comment3566>

    Longer than 80.



src/java/com/cloudera/sqoop/orm/ClassWriter.java
<https://reviews.apache.org/r/1598/#comment3567>

    Longer than 80.



src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/1598/#comment3569>

    Longer than 80.



src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java
<https://reviews.apache.org/r/1598/#comment3568>

    Longer than 80.



src/test/com/cloudera/sqoop/hive/TestHiveImport.java
<https://reviews.apache.org/r/1598/#comment3570>

    Longer than 80.



src/test/com/cloudera/sqoop/hive/TestHiveImport.java
<https://reviews.apache.org/r/1598/#comment3571>

    Longer than 80.


- Arvind


On 2011-08-19 18:52:15, Joey Echeverria wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1598/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-19 18:52:15)
bq.  
bq.  
bq.  Review request for Sqoop.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  I added a new option, --hive-delims-replacement, which lets you pass in a replacement
bq.  string. I did it with a new option to remain backwards compatible with the existing interface.
bq.  
bq.  
bq.  This addresses bug SQOOP-319.
bq.      https://issues.apache.org/jira/browse/SQOOP-319
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/docs/user/hive-args.txt 7e6b7a0 
bq.    src/docs/user/hive.txt 059d7cb 
bq.    src/java/com/cloudera/sqoop/SqoopOptions.java d760d39 
bq.    src/java/com/cloudera/sqoop/lib/FieldFormatter.java 41536e1 
bq.    src/java/com/cloudera/sqoop/orm/ClassWriter.java dd3994e 
bq.    src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java 8f629f1 
bq.    src/test/com/cloudera/sqoop/hive/TestHiveImport.java 35de2fd 
bq.    testdata/hive/scripts/fieldWithNewlineReplacementImport.q PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/1598/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  I added a unit test for the new option. I also tested the feature by hand. It works,
bq.  but I found a bug when doing --direct (at least with MySQL). It doesn't end up calling the
bq.  hiveStringDropDelims() function. Some other kind of escaping is going on. I'll file that as
bq.  a separate JIRA.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Joey
bq.  
bq.



> The --hive-drop-import-delims option should accept a replacement string
> -----------------------------------------------------------------------
>
>                 Key: SQOOP-319
>                 URL: https://issues.apache.org/jira/browse/SQOOP-319
>             Project: Sqoop
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 1.3.0
>            Reporter: Joey Echeverria
>            Assignee: Joey Echeverria
>            Priority: Minor
>         Attachments: SQOOP-319-1.patch
>
>
> When importing data into Hive, you have the option of dropping the Hive delimiters in
> data fields. It would be more useful to replace the delimiters with a user-defined string.
> Often the dropped delimiters (like \n) are separating words. If I want to split on white
> space in my Hive queries, I'll now get two words merged together. A more desirable behavior
> would be to replace them with a space. Making it user-configurable will give the most flexibility.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
