sqoop-dev mailing list archives

From "Ryan Blue (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1751) Rearrange LinkConfig and ToJobConfig
Date Fri, 21 Nov 2014 18:17:34 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221185#comment-14221185 ]

Ryan Blue commented on SQOOP-1751:

What effect does this have on the user? Are these all still just config options, or does an
administrator need to set up the dataset URI if it is part of the link data?

It seems like the link data should be used for anything that can be used across jobs or needs
to be hidden. While the dataset URI is like a JDBC URI, it is also much more like a table.
In fact, that's exactly what the Hive URI pattern encodes: {{dataset:hive:<db>/<table>}}.
That doesn't need to be hidden and could be different between jobs. The link information *can*
be encoded in the URI, but it doesn't need to be. For example, we can add the metastore host
and port: {{dataset:hive://ms-host:9083/db/table}}, but we prefer that it come from the environment,
which looks like a perfect use of the link-level data. I think we should plan on this mode
of operation in the future and keep link information out of Kite URIs when we define new URIs
in Kite (the current ones don't have sensitive link info).
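
To make the two URI forms above concrete, here is a minimal Python sketch. It is not Kite's or Sqoop's actual API: the function name {{build_hive_dataset_uri}} and its parameters are hypothetical and only illustrate the split, with the database and table treated as job-level values and the optional metastore host and port treated as link-level values (or left to the environment).

```python
from typing import Optional


def build_hive_dataset_uri(
    db: str,
    table: str,
    metastore_host: Optional[str] = None,  # hypothetical link-level value
    metastore_port: Optional[int] = None,  # hypothetical link-level value
) -> str:
    """Compose a Hive dataset URI following the two patterns quoted above.

    db and table stand in for job-level config. When no metastore host is
    given, the short form is produced and the metastore location is assumed
    to come from the environment.
    """
    if metastore_host is None:
        # Short form: dataset:hive:<db>/<table>
        return f"dataset:hive:{db}/{table}"

    # Long form: the link information is encoded in the URI authority.
    authority = metastore_host
    if metastore_port is not None:
        authority = f"{metastore_host}:{metastore_port}"
    return f"dataset:hive://{authority}/{db}/{table}"


if __name__ == "__main__":
    print(build_hive_dataset_uri("db", "table"))
    print(build_hive_dataset_uri("db", "table", "ms-host", 9083))
```

With only the job-level values the sketch yields {{dataset:hive:db/table}}; passing the link-level host and port yields {{dataset:hive://ms-host:9083/db/table}}.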

In summary, from [~stanleyxu2005]'s examples, I currently think that the Kite URI is a job

> Rearrange LinkConfig and ToJobConfig
> ------------------------------------
>                 Key: SQOOP-1751
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1751
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: connectors
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>            Priority: Minor
>             Fix For: 1.99.5
> Regarding Abe's review comments, here are thoughts to rearrange configuration items.
> 1. File Output Format should be specified in ToJobConfig. 
> 2. Corresponding validation should happen on time.
> 3. Split protocol, host and port part to link and leave destination to toJobConfig

This message was sent by Atlassian JIRA