sqoop-dev mailing list archives

From "Sandish Kumar HN (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SQOOP-3178) SQOOP PARQUET INCREMENTAL MERGE
Date Wed, 17 May 2017 18:33:04 GMT

     [ https://issues.apache.org/jira/browse/SQOOP-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandish Kumar HN updated SQOOP-3178:
------------------------------------
    Description: Currently, sqoop-1 only supports merging two Parquet data sets; it does
not support incremental merge. I have written a Sqoop incremental merge MR job for the
Parquet file format and tested it with a million records of data over N iterations.
(was: Hi.
I see that there is no Parquet incremental merge. I took the Sqoop 1.4.6 source code
and wrote an MR job for Parquet incremental merge. Can anyone give me specific instructions
for pushing the Parquet incremental merge code to the latest version?)

> SQOOP PARQUET INCREMENTAL MERGE 
> --------------------------------
>
>                 Key: SQOOP-3178
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3178
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: build, codegen, connectors
>         Environment: None
>            Reporter: Sandish Kumar HN
>            Priority: Critical
>
> Currently, sqoop-1 only supports merging two Parquet data sets; it does not support
> incremental merge. I have written a Sqoop incremental merge MR job for the Parquet file
> format and tested it with a million records of data over N iterations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
