sqoop-dev mailing list archives

From "Jarek Jarcec Cecho (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1055) Add option to export from Hive use HQL query
Date Sat, 29 Jun 2013 20:55:20 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696195#comment-13696195 ]

Jarek Jarcec Cecho commented on SQOOP-1055:

Hi [~konstantinos],
I'm glad to see your interest in this issue! As it happens, this is quite a big feature request
that can be achieved in multiple ways, so there is no simple area of code that we can reference
at the moment.
> Add option to export from Hive use HQL query
> --------------------------------------------
>                 Key: SQOOP-1055
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1055
>             Project: Sqoop
>          Issue Type: Improvement
>            Reporter: Hari Sekhon
> Sqoop currently has a --query option for import but not for export.
> It would be nice if an export --query option supporting HiveQL could be added, as users
> currently have to create a temporary table and then export that as a two-step process, with
> a full disk rewrite of all the to-be-exported data to a new table before the sqoop export
> command is started.
> Since Sqoop executes a distributed map-only job, I believe certain queries, such as joins
> that have to be done via a reduce phase, will yield little performance improvement because the
> map->reduce intermediate writes need to be written anyway. However, we could save on
> the final reduce phase writes and also turn this into a more convenient one-step process
> instead of a two-step one.
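
For readers unfamiliar with the two-step workaround described in the issue, here is a rough sketch of what it might look like today. It only builds the two command lines and does not run them. The staging table name, JDBC URL, target table, and warehouse path are all hypothetical placeholders, and the sketch assumes the staging table is a plain text table stored under Hive's default warehouse directory with Hive's default Ctrl-A field delimiter.

```python
import shlex


def two_step_export_commands(hql, staging_table, jdbc_url, target_table,
                             warehouse_dir="/user/hive/warehouse"):
    """Build the two commands of the temp-table workaround (nothing is executed).

    All argument values are illustrative; warehouse_dir assumes Hive's
    default warehouse location, which may differ on a given cluster.
    """
    # Step 1: materialize the query result as a Hive table.
    # This is the full rewrite of the to-be-exported data mentioned in the issue.
    ctas = f"CREATE TABLE {staging_table} AS {hql}"
    hive_cmd = ["hive", "-e", ctas]

    # Step 2: export the staging table's files with a Sqoop export job.
    sqoop_cmd = [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--table", target_table,
        "--export-dir", f"{warehouse_dir}/{staging_table}",
        # Assumption: the staging table uses Hive's default text delimiter (Ctrl-A).
        "--input-fields-terminated-by", r"\001",
    ]
    return hive_cmd, sqoop_cmd


# Hypothetical example values, chosen only for illustration.
hive_cmd, sqoop_cmd = two_step_export_commands(
    hql="SELECT id, name FROM customers WHERE active = true",
    staging_table="tmp_export",
    jdbc_url="jdbc:mysql://db.example.com/sales",
    target_table="customers_export",
)

print(shlex.join(hive_cmd))
print(shlex.join(sqoop_cmd))
```

An export --query option, as requested here, would presumably fold step 1 into the export job so that the staging table is never written.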

